7,615 research outputs found

    Temporal Ordered Clustering in Dynamic Networks: Unsupervised and Semi-supervised Learning Algorithms

    In temporal ordered clustering, given a single snapshot of a dynamic network in which nodes arrive at distinct time instants, we aim at partitioning its nodes into $K$ ordered clusters $\mathcal{C}_1 \prec \cdots \prec \mathcal{C}_K$ such that for $i<j$, nodes in cluster $\mathcal{C}_i$ arrived before nodes in cluster $\mathcal{C}_j$, with $K$ being a data-driven parameter and not known upfront. Such a problem is of considerable significance in many applications ranging from tracking the expansion of fake news to mapping the spread of information. We first formulate our problem for a general dynamic graph, and propose an integer programming framework that finds the optimal clustering, represented as a strict partial order set, achieving the best precision (i.e., fraction of successfully ordered node pairs) for a fixed density (i.e., fraction of comparable node pairs). We then develop a sequential importance procedure and design unsupervised and semi-supervised algorithms to find temporal ordered clusters that efficiently approximate the optimal solution. To illustrate the techniques, we apply our methods to the vertex copying (duplication-divergence) model which exhibits some edge-case challenges in inferring the clusters as compared to other network models. Finally, we validate the performance of the proposed algorithms on synthetic and real-world networks. Comment: 14 pages, 9 figures, and 3 tables. This version is submitted to a journal. A shorter version of this work is published in the proceedings of IEEE International Symposium on Information Theory (ISIT), 2020. The first two authors contributed equally.
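The precision and density measures named in this abstract can be illustrated with a small sketch. It is not the authors' implementation. It assumes that density is the number of node pairs placed in different clusters divided by all unordered node pairs, and that precision is the number of correctly ordered pairs divided by the number of comparable pairs; the abstract does not state the denominators. The function name `precision_and_density` and the toy data are hypothetical.

```python
from itertools import combinations


def precision_and_density(cluster_of, arrival_time):
    """Score an ordered clustering against known arrival times.

    cluster_of:   node -> cluster index (a smaller index means an earlier cluster)
    arrival_time: node -> true arrival time (assumed distinct for all nodes)
    Returns (precision, density) under the assumed denominators described above.
    """
    total = comparable = correct = 0
    for u, v in combinations(list(cluster_of), 2):
        total += 1
        if cluster_of[u] == cluster_of[v]:
            continue  # same cluster: the pair is left unordered
        comparable += 1
        predicted_u_first = cluster_of[u] < cluster_of[v]
        actual_u_first = arrival_time[u] < arrival_time[v]
        if predicted_u_first == actual_u_first:
            correct += 1
    density = comparable / total if total else 0.0
    precision = correct / comparable if comparable else 0.0
    return precision, density


# Toy example: four nodes arriving at times 1..4, split into two ordered clusters.
arrival_time = {"a": 1, "b": 2, "c": 3, "d": 4}
cluster_of = {"a": 1, "b": 1, "c": 2, "d": 2}
precision, density = precision_and_density(cluster_of, arrival_time)
```

In this toy case four of the six pairs are comparable and all four are ordered correctly. Using fewer, larger clusters lowers density but tends to raise precision, which is the trade-off the abstract describes.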

    Laplacian Mixture Modeling for Network Analysis and Unsupervised Learning on Graphs

    Laplacian mixture models identify overlapping regions of influence in unlabeled graph and network data in a scalable and computationally efficient way, yielding useful low-dimensional representations. By combining Laplacian eigenspace and finite mixture modeling methods, they provide probabilistic or fuzzy dimensionality reductions or domain decompositions for a variety of input data types, including mixture distributions, feature vectors, and graphs or networks. Provably optimal recovery using the algorithm is shown analytically for a nontrivial class of cluster graphs. Heuristic approximations for scalable high-performance implementations are described and empirically tested. Connections to PageRank and community detection in network analysis demonstrate the wide applicability of this approach. The origins of fuzzy spectral methods, beginning with generalized heat or diffusion equations in physics, are reviewed and summarized. Comparisons to other dimensionality reduction and clustering methods for challenging unsupervised machine learning problems are also discussed. Comment: 13 figures, 35 references

    Modularity and anti-modularity in networks with arbitrary degree distribution

    Networks describing the interaction of the elements that constitute a complex system grow and develop via a number of different mechanisms, such as the addition and deletion of nodes, the addition and deletion of edges, as well as the duplication or fusion of nodes. While each of these mechanisms can have a different cause depending on whether the network is biological, technological, or social, their impact on the network's structure, as well as its local and global properties, is similar. This allows us to study how each of these mechanisms affects networks either alone or together with the other processes, and how they shape the characteristics that have been observed. We study how a network's growth parameters impact the distribution of edges in the network, how they affect a network's modularity, and point out that some parameters will give rise to networks that have the opposite tendency, namely to display anti-modularity. Within the model we are describing, we can search the space of possible networks for parameter sets that generate networks that are very similar to well-known and well-studied examples, such as the brain of a worm, and the network of interactions of the proteins in baker's yeast. Comment: 23 pages. 13 figures, 1 table. Includes Supplementary text
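As a rough illustration of the modularity versus anti-modularity contrast, the sketch below computes the standard Newman-Girvan modularity score Q for a fixed grouping of nodes. This is an assumption: the abstract does not say which modularity measure the authors use, and the function name `modularity` and the toy graphs are hypothetical. The graph is assumed to be simple and undirected, with no self-loops.

```python
from collections import defaultdict


def modularity(edges, community_of):
    """Newman-Girvan modularity Q of a node grouping (assumed measure).

    edges:        list of undirected edges (u, v), no self-loops or duplicates
    community_of: node -> community label
    Q = sum over communities c of [ e_c / m - (d_c / (2 m))^2 ],
    where m is the number of edges, e_c the number of edges inside c,
    and d_c the total degree of the nodes in c.
    """
    m = len(edges)
    if m == 0:
        return 0.0
    internal_edges = defaultdict(int)
    degree_sum = defaultdict(int)
    for u, v in edges:
        degree_sum[community_of[u]] += 1
        degree_sum[community_of[v]] += 1
        if community_of[u] == community_of[v]:
            internal_edges[community_of[u]] += 1
    return sum(
        internal_edges[c] / m - (degree_sum[c] / (2 * m)) ** 2
        for c in degree_sum
    )


# Two triangles joined by one edge: edges mostly fall inside the groups.
modular_edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
modular_groups = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B"}
q_modular = modularity(modular_edges, modular_groups)

# A 4-cycle grouped so that every edge crosses between groups.
cycle_edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
cycle_groups = {0: "A", 2: "A", 1: "B", 3: "B"}
q_anti = modularity(cycle_edges, cycle_groups)
```

A positive score means more edges fall inside groups than a degree-preserving random baseline would give; a negative score, as in the second example, means fewer, which is one way to read "anti-modularity".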