8 research outputs found

    Fragmentation Coagulation Based Mixed Membership Stochastic Blockmodel

    The Mixed-Membership Stochastic Blockmodel (MMSB) is one of the state-of-the-art Bayesian relational methods for learning the complex hidden structure underlying network data. However, the current formulation of MMSB suffers from two issues: (1) prior information (e.g. entities' community structure) cannot be well embedded in the model; (2) community evolution is not well described in the literature. We therefore propose a non-parametric fragmentation coagulation based Mixed Membership Stochastic Blockmodel (fcMMSB). Our model simultaneously performs entity-based clustering, to capture the community information for entities, and linkage-based clustering, to derive the group information for links. In addition, the proposed model infers the network structure and models community evolution, manifested by the appearance and disappearance of communities, using the discrete fragmentation coagulation process (DFCP). By integrating the community structure with the group compatibility matrix, we derive a generalized version of MMSB. An efficient Gibbs sampling scheme with the Pólya-Gamma (PG) approach is implemented for posterior inference. We validate our model on synthetic and real-world data. Comment: AAAI 202

    A Poisson Gamma Probabilistic Model for Latent Node-group Memberships in Dynamic Networks

    We present a probabilistic model for learning from dynamic relational data, wherein the observed interactions among networked nodes are modeled via the Bernoulli-Poisson link function, and the underlying network structure is characterized by nonnegative latent node-group memberships, which are assumed to be gamma distributed. The latent memberships evolve according to a Markov process. The optimal number of latent groups can be determined by the data itself. The computational complexity of our method scales with the number of non-zero links, which makes it scalable to large, sparse dynamic relational data. We present batch and online Gibbs sampling algorithms to perform model inference. Finally, we demonstrate the model's performance on both synthetic and real-world datasets in comparison with state-of-the-art methods.
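    As a rough illustration of the Bernoulli-Poisson link mentioned in the abstract above, the sketch below draws gamma-distributed memberships and turns them into link probabilities. The abstract does not give the rate parameterization, so the inner-product rate, the function names, and the shape and scale values are all assumptions made for illustration, not the authors' model.

```python
import math
import random


def sample_memberships(n_nodes, n_groups, shape=0.5, scale=1.0, seed=0):
    """Draw nonnegative node-group memberships from a gamma distribution.

    The shape and scale values are illustrative assumptions.
    """
    rng = random.Random(seed)
    return [
        [rng.gammavariate(shape, scale) for _ in range(n_groups)]
        for _ in range(n_nodes)
    ]


def link_probability(phi_i, phi_j):
    """Bernoulli-Poisson link: a link exists iff a latent Poisson count is >= 1.

    With rate = sum_k phi_i[k] * phi_j[k] (an assumed parameterization),
    P(link) = 1 - exp(-rate).
    """
    rate = sum(a * b for a, b in zip(phi_i, phi_j))
    return 1.0 - math.exp(-rate)


if __name__ == "__main__":
    phi = sample_memberships(n_nodes=4, n_groups=3)
    for i in range(len(phi)):
        for j in range(i + 1, len(phi)):
            print(i, j, round(link_probability(phi[i], phi[j]), 3))
```

    Under this link, a pair of nodes sharing no group has rate zero and therefore link probability zero, which is one intuition for why inference cost can depend mainly on the non-zero links.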


    Thesis (Ph.D.) - Indiana University, Luddy School of Informatics, Computing, and Engineering/University Graduate School, 2020. Community detection has always been one of the fundamental research topics in graph mining. As an unsupervised or semi-supervised approach, community detection aims to explore high-order closeness between nodes by leveraging graph topological structure. By grouping similar nodes or edges into the same community while separating dissimilar ones into different communities, graph structure can be revealed at a coarser resolution. This can benefit numerous applications, such as shopping recommendation and advertising in e-commerce, protein-protein interaction prediction in bioinformatics, and literature recommendation or scholar collaboration in citation analysis. However, identifying communities is an ill-defined problem. Due to the No Free Lunch theorem [1], there is neither a gold standard that represents a perfect community partition nor a universal method able to detect satisfactory communities for all tasks on all types of graphs. To give a global view of this research topic, I summarize state-of-the-art community detection methods by categorizing them according to graph type, research task and methodological framework. As academic work on community detection has grown rapidly in recent years, I focus in particular on state-of-the-art work published in the last decade, which may leave out some classic models published decades ago. In addition, three subtle community detection tasks are proposed and assessed in this dissertation. First, unlike general models, which consider only graph structure, personalized community detection uses user need as auxiliary information to guide community detection. The result is fine-grained communities for the nodes that better match user needs and coarser-resolution communities for the remaining, less relevant nodes. Second, graphs often suffer from sparse connectivity. Applying conventional models directly to such graphs may severely degrade the quality of the generated communities. To tackle this problem, cross-graph techniques are used to propagate information from external graphs in support of community detection on the target graph. Third, graph community structure supports a natural language processing (NLP) task that depicts the intrinsic characteristics of nodes by generating node summaries with a text generative model. The contribution of this dissertation is threefold. First, a substantial body of research is reviewed and summarized under a well-defined taxonomy; existing work on methods, evaluation and applications is addressed in the literature review. Second, three novel community detection tasks are presented, and the associated models are proposed and evaluated against state-of-the-art baselines on various datasets. Third, the limitations of current work are pointed out and promising future research directions are discussed.