Block modelling in dynamic networks with non-homogeneous Poisson processes and exact ICL
We develop a model in which interactions between nodes of a dynamic network
are counted by non-homogeneous Poisson processes. In a block modelling
perspective, nodes belong to hidden clusters (whose number is unknown) and the
intensity functions of the counting processes only depend on the clusters of
nodes. In order to make inference tractable we move to discrete time by
partitioning the entire time horizon in which interactions are observed into
fixed-length time sub-intervals. First, we derive an exact integrated
classification likelihood criterion and maximize it relying on a greedy search
approach. This allows us to estimate the cluster memberships and the number of
clusters simultaneously. Then a maximum-likelihood estimator is developed to
estimate the integrated intensities nonparametrically. We discuss the
over-fitting problems of the model and propose a regularized version solving
these issues. Experiments on real and simulated data are carried out in order
to assess the proposed methodology.
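The discretization step described in this abstract can be illustrated with a minimal sketch. It assumes directed interactions without self-loops, cluster labels that are already known, and the standard Poisson result that the maximum-likelihood estimate of an integrated intensity is the observed count divided by the number of node pairs that could have produced it. The function name `integrated_intensity_mle` and its arguments are hypothetical and are not taken from the paper; the greedy ICL search that would estimate the labels is not shown.

```python
from collections import defaultdict


def integrated_intensity_mle(events, labels, t_end, n_intervals):
    """Estimate integrated intensities per (source cluster, target cluster, interval).

    events      : iterable of (i, j, t) directed interactions with 0 <= t <= t_end
    labels      : dict mapping node -> cluster (assumed known here)
    t_end       : length of the observation horizon
    n_intervals : number of fixed-length sub-intervals
    """
    width = t_end / n_intervals

    # Count interactions per (cluster k, cluster g, interval u).
    counts = defaultdict(int)
    for i, j, t in events:
        u = min(int(t // width), n_intervals - 1)  # t == t_end goes to the last bin
        counts[(labels[i], labels[j], u)] += 1

    # Cluster sizes, needed for the number of ordered node pairs.
    sizes = defaultdict(int)
    for k in labels.values():
        sizes[k] += 1

    estimates = {}
    for (k, g, u), c in counts.items():
        # Ordered pairs without self-loops (an assumption of this sketch).
        n_pairs = sizes[k] * (sizes[k] - 1) if k == g else sizes[k] * sizes[g]
        if n_pairs > 0:
            estimates[(k, g, u)] = c / n_pairs
    return estimates


events = [(0, 1, 0.5), (0, 2, 0.7), (1, 0, 1.5), (2, 3, 1.9)]
labels = {0: "a", 1: "a", 2: "b", 3: "b"}
estimates = integrated_intensity_mle(events, labels, t_end=2.0, n_intervals=2)
```

Triples with no observed interaction are simply absent from the result, which corresponds to an estimate of zero. Zero estimates of this kind are one place where over-fitting can show up when sub-intervals are short.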
Community Detection in Complex Networks
The stochastic block model is a powerful tool for inferring community structure from network topology. However, the simple block model considers community structure as the only underlying attribute for forming the relational interactions among the nodes; this makes it prefer a Poisson degree distribution within each community, while most real-world networks have a heavy-tailed degree distribution. This is essentially because the simple assumption under the traditional block model is not consistent with some real-world circumstances where factors other than the community memberships, such as overall popularity, also heavily affect the pattern of the relational interactions. The degree-corrected block model can accommodate arbitrary degree distributions within communities by taking nodes' popularity or degree into account. But since it takes the vertex degrees as parameters rather than generating them, it cannot use them to help it classify the vertices, and its natural generalization to directed graphs cannot even use the orientations of the edges. We developed several variants of the block model with the best of both worlds: they can use vertex degrees and edge orientations in the classification process, while tolerating heavy-tailed degree distributions within communities. We show that for some networks, including synthetic networks and networks of word adjacencies in English text, these new block models achieve a higher accuracy than either standard or degree-corrected block models. Another part of my work is to develop even more generalized block models, which incorporate other attributes of the nodes. Many data sets contain rich information about objects, as well as pairwise relations between them. For instance, in networks of websites, scientific papers, patents and other documents, each node has content consisting of a collection of words, as well as hyperlinks or citations to other nodes.
In order to perform inference on such data sets, and make predictions and recommendations, it is useful to have models that are able to capture the processes which generate the text at each node as well as the links between them. Our work combines classic ideas in topic modeling with a variant of the mixed-membership block model recently developed in the statistical physics community. The resulting model has the advantage that its parameters, including the mixture of topics of each document and the resulting overlapping communities, can be inferred with a simple and scalable expectation-maximization algorithm. We test our model on three data sets, performing unsupervised topic classification and link prediction. For both tasks, our model outperforms several existing state-of-the-art methods, achieving higher accuracy with significantly less computation, analyzing a data set with 1.3 million words and 44 thousand links in a few minutes.
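The contrast between the standard and the degree-corrected block model can be made concrete with a short sketch. It assumes an undirected graph and uses the profile log-likelihoods commonly written for the Poisson formulation of the two models (up to additive constants and an overall factor): group sizes appear in the standard objective, while sums of degrees per group take their place in the degree-corrected one. The function name `block_log_likelihoods` is hypothetical, and this is not the set of new variants developed in the thesis.

```python
import math
from collections import defaultdict


def block_log_likelihoods(edges, labels):
    """Return (standard, degree_corrected) profile log-likelihoods of a partition.

    edges  : iterable of undirected edges (i, j)
    labels : dict mapping node -> group
    """
    m = defaultdict(int)      # m[r, s]: edges between groups; within-group edges count twice
    kappa = defaultdict(int)  # kappa[r]: sum of degrees of the nodes in group r
    n = defaultdict(int)      # n[r]: number of nodes in group r

    for g in labels.values():
        n[g] += 1
    for i, j in edges:
        r, s = labels[i], labels[j]
        m[r, s] += 1
        m[s, r] += 1
        kappa[r] += 1
        kappa[s] += 1

    # The standard model normalizes by group sizes.
    standard = sum(c * math.log(c / (n[r] * n[s])) for (r, s), c in m.items())
    # The degree-corrected model normalizes by group degree sums instead.
    corrected = sum(c * math.log(c / (kappa[r] * kappa[s])) for (r, s), c in m.items())
    return standard, corrected


edges = [(0, 1), (2, 3), (1, 2)]
labels = {0: "a", 1: "a", 2: "b", 3: "b"}
standard, corrected = block_log_likelihoods(edges, labels)
```

Each value is meant for comparing candidate partitions under the same model, not for comparing the two models with each other. In the degree-corrected objective the degrees enter only through fixed group totals, which is the sense in which that model treats degrees as given parameters instead of generating them.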
A generative model for reciprocity and community detection in networks
We present a probabilistic generative model and efficient algorithm to model
reciprocity in directed networks. Unlike other methods that address this
problem such as exponential random graphs, it assigns latent variables as
community memberships to nodes and a reciprocity parameter to the whole network
rather than fitting order statistics. It formalizes the assumption that a
directed interaction is more likely to occur if an individual has already
observed an interaction towards her. It provides a natural framework for
relaxing the common assumption in network generative models of conditional
independence between edges, and it can be used to perform inference tasks such
as predicting the existence of an edge given the observation of an edge in the
reverse direction. Inference is performed using an efficient
expectation-maximization algorithm that exploits the sparsity of the network,
leading to an efficient and scalable implementation. We illustrate these
findings by analyzing synthetic and real data, including social networks,
academic citations and the Erasmus student exchange program. Our method
outperforms others in both predicting edges and generating networks that
reflect the reciprocity values observed in real data, while at the same time
inferring an underlying community structure. We provide an open-source
implementation of the code online.
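The reciprocity value that this abstract refers to can be measured on an observed network with a few lines of code. The sketch below assumes the common definition of reciprocity as the fraction of directed edges whose reverse edge is also present, with self-loops and duplicate edges ignored. The function name `reciprocity` is hypothetical; this computes a descriptive statistic only and is not the generative model or the inference algorithm of the paper.

```python
def reciprocity(edges):
    """Fraction of directed edges (i, j) for which (j, i) is also present."""
    # Deduplicate edges and drop self-loops (an assumption of this sketch).
    edge_set = {(i, j) for i, j in edges if i != j}
    if not edge_set:
        return 0.0
    mutual = sum(1 for (i, j) in edge_set if (j, i) in edge_set)
    return mutual / len(edge_set)


edges = [(0, 1), (1, 0), (1, 2), (2, 3)]
value = reciprocity(edges)
```

Comparing this statistic between an observed network and networks sampled from a fitted model is one simple way to check whether the model reproduces the observed level of reciprocity.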