
    When is a Network a Network? Multi-Order Graphical Model Selection in Pathways and Temporal Networks

    We introduce a framework for the modeling of sequential data capturing pathways of varying lengths observed in a network. Such data are important, e.g., when studying click streams in information networks, travel patterns in transportation systems, information cascades in social networks, biological pathways or time-stamped social interactions. While it is common to apply graph analytics and network analysis to such data, recent works have shown that temporal correlations can invalidate the results of such methods. This raises a fundamental question: when is a network abstraction of sequential data justified? Addressing this open question, we propose a framework which combines Markov chains of multiple, higher orders into a multi-layer graphical model that captures temporal correlations in pathways at multiple length scales simultaneously. We develop a model selection technique to infer the optimal number of layers of such a model and show that it outperforms previously used Markov order detection techniques. An application to eight real-world data sets on pathways and temporal networks shows that it allows us to infer graphical models which capture both topological and temporal characteristics of such data. Our work highlights fallacies of network abstractions and provides a principled answer to the open question of when they are justified. Generalizing network representations to multi-order graphical models, it opens perspectives for new data mining and knowledge discovery algorithms. Comment: 10 pages, 4 figures, 1 table, companion Python package pathpy available on gitHu
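To make the idea of temporal correlations in pathways concrete, the following is a minimal sketch that estimates fixed-order Markov transition probabilities from observed paths. It is an illustration only: the function name `fit_markov` and the toy paths are hypothetical, and it is neither the multi-order model selection described in the abstract nor the API of the pathpy package.

```python
from collections import Counter, defaultdict


def fit_markov(paths, order):
    """Estimate transition probabilities of a fixed-order Markov chain.

    `paths` is an iterable of node sequences; `order` is the memory length.
    Returns {history_tuple: {next_node: probability}}.
    """
    counts = defaultdict(Counter)
    for path in paths:
        # Slide a window of `order` nodes over the path and count what follows.
        for i in range(len(path) - order):
            history = tuple(path[i:i + order])
            counts[history][path[i + order]] += 1
    return {
        history: {nxt: c / sum(ctr.values()) for nxt, c in ctr.items()}
        for history, ctr in counts.items()
    }


# Hypothetical toy data: paths through c always continue "in line".
paths = [("a", "c", "d"), ("b", "c", "e")]
first_order = fit_markov(paths, order=1)
second_order = fit_markov(paths, order=2)
```

In this toy data the first-order model assigns equal probability to moving from `c` to `d` or to `e`, while the second-order model shows that the next step is fully determined by where the path came from. A plain network abstraction corresponds to the first-order view and would miss this dependence.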

    On Ordinal Invariants in Well Quasi Orders and Finite Antichain Orders

    We investigate the ordinal invariants height, length, and width of well quasi orders (WQO), with particular emphasis on width, an invariant of interest for the larger class of orders with finite antichain condition (FAC). We show that the width in the class of FAC orders is completely determined by the width in the class of WQOs, in the sense that if we know how to calculate the width of any WQO then we have a procedure to calculate the width of any given FAC order. We show how the width of WQOs obtained via some classical constructions can sometimes be computed in a compositional way. In particular, this allows proving that every ordinal can be obtained as the width of some WQO poset. One of the difficult questions is to give a complete formula for the width of Cartesian products of WQOs. Even the width of the product of two ordinals is only known through a complex recursive formula. Although we have not given a complete answer to this question, we have advanced the state of knowledge by considering some more complex special cases and in particular by calculating the width of certain products containing three factors. In the course of writing the paper we have discovered that some of the relevant literature was written at cross-purposes and some of the notions were rediscovered several times. Therefore we also use the occasion to give a unified presentation of the known results

    Permutation Models for Collaborative Ranking

    We study the problem of collaborative filtering where ranking information is available. Focusing on the core of the collaborative ranking process, the user and their community, we propose new models for representation of the underlying permutations and prediction of ranks. The first approach is based on the assumption that the user makes successive choices of items in a stage-wise manner. In particular, we extend the Plackett-Luce model in two ways: introducing parameter factoring to account for user-specific contribution, and modelling the latent community in a generative setting. The second approach relies on log-linear parameterisation, which relaxes the discrete-choice assumption, but makes learning and inference much more involved. We propose MCMC-based learning and inference methods and derive linear-time prediction algorithms
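The stage-wise choice assumption can be illustrated with the standard Plackett-Luce model, in which the item at each stage is chosen with probability proportional to the exponential of its score among the items not yet chosen. The sketch below implements only this basic model; the function name `plackett_luce_log_prob` and the score dictionary are hypothetical, and the user-specific parameter factoring and latent community extensions from the abstract are not shown.

```python
import math


def plackett_luce_log_prob(ranking, scores):
    """Log-probability of `ranking` (best item first) under Plackett-Luce.

    `scores` maps each item to a real-valued utility. At every stage the
    next item is drawn with probability exp(score) / sum of exp(score)
    over the items that have not been ranked yet.
    """
    remaining = list(ranking)
    log_prob = 0.0
    for item in ranking:
        log_norm = math.log(sum(math.exp(scores[j]) for j in remaining))
        log_prob += scores[item] - log_norm
        remaining.remove(item)  # the chosen item leaves the candidate pool
    return log_prob


# Hypothetical utilities for three items.
example_scores = {"a": 1.0, "b": 0.5, "c": -2.0}
example_log_prob = plackett_luce_log_prob(["a", "b", "c"], example_scores)
```

With equal scores every ranking of n items has probability 1/n!, and the probabilities of all rankings sum to one for any choice of scores.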

    Evaluating Variable Length Markov Chain Models for Analysis of User Web Navigation Sessions

    Markov models have been widely used to represent and analyse user web navigation data. In previous work we have proposed a method to dynamically extend the order of a Markov chain model and a complementary method for assessing the predictive power of such a variable length Markov chain. Herein, we review these two methods and propose a novel method for measuring the ability of a variable length Markov model to summarise user web navigation sessions up to a given length. While the summarisation ability of a model is important to enable the identification of user navigation patterns, the ability to make predictions is important in order to foresee the next link choice of a user after following a given trail so as, for example, to personalise a web site. We present an extensive experimental evaluation providing strong evidence that prediction accuracy increases linearly with summarisation ability

    Model Theoretic Complexity of Automatic Structures

    We study the complexity of automatic structures via well-established concepts from both logic and model theory, including ordinal heights (of well-founded relations), Scott ranks of structures, and Cantor-Bendixson ranks (of trees). We prove the following results: 1) The ordinal height of any automatic well-founded partial order is bounded by $\omega^\omega$; 2) The ordinal heights of automatic well-founded relations are unbounded below the first non-computable ordinal; 3) For any computable ordinal there is an automatic structure of Scott rank at least that ordinal. Moreover, there are automatic structures of Scott rank the first non-computable ordinal and its successor; 4) For any computable ordinal, there is an automatic successor tree of Cantor-Bendixson rank that ordinal. Comment: 23 pages. Extended abstract appeared in Proceedings of TAMC '08, LNCS 4978 pp 514-52

    Temporal Ordered Clustering in Dynamic Networks: Unsupervised and Semi-supervised Learning Algorithms

    In temporal ordered clustering, given a single snapshot of a dynamic network in which nodes arrive at distinct time instants, we aim at partitioning its nodes into $K$ ordered clusters $\mathcal{C}_1 \prec \cdots \prec \mathcal{C}_K$ such that for $i<j$, nodes in cluster $\mathcal{C}_i$ arrived before nodes in cluster $\mathcal{C}_j$, with $K$ being a data-driven parameter and not known upfront. Such a problem is of considerable significance in many applications ranging from tracking the expansion of fake news to mapping the spread of information. We first formulate our problem for a general dynamic graph, and propose an integer programming framework that finds the optimal clustering, represented as a strict partial order set, achieving the best precision (i.e., fraction of successfully ordered node pairs) for a fixed density (i.e., fraction of comparable node pairs). We then develop a sequential importance procedure and design unsupervised and semi-supervised algorithms to find temporal ordered clusters that efficiently approximate the optimal solution. To illustrate the techniques, we apply our methods to the vertex copying (duplication-divergence) model which exhibits some edge-case challenges in inferring the clusters as compared to other network models. Finally, we validate the performance of the proposed algorithms on synthetic and real-world networks. Comment: 14 pages, 9 figures, and 3 tables. This version is submitted to a journal. A shorter version of this work is published in the proceedings of IEEE International Symposium on Information Theory (ISIT), 2020. The first two authors contributed equall
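The two quality measures named in the abstract can be sketched for a given ordered clustering as follows. This is a hedged illustration, not the paper's integer program: the function name and inputs are hypothetical, a pair of nodes is treated as comparable when the two nodes lie in different clusters, and precision is assumed to be measured over comparable pairs only.

```python
from itertools import combinations


def precision_and_density(cluster_of, arrival_time):
    """Score an ordered clustering against known arrival times.

    `cluster_of` maps node -> cluster index (a lower index means an earlier
    cluster); `arrival_time` maps node -> true, distinct arrival time.
    Density   = comparable pairs / all pairs.
    Precision = correctly ordered comparable pairs / comparable pairs
                (assumed normalisation).
    """
    total = comparable = correct = 0
    for u, v in combinations(list(cluster_of), 2):
        total += 1
        if cluster_of[u] == cluster_of[v]:
            continue  # same cluster: the partial order leaves this pair unordered
        comparable += 1
        clustering_says_u_first = cluster_of[u] < cluster_of[v]
        truth_says_u_first = arrival_time[u] < arrival_time[v]
        if clustering_says_u_first == truth_says_u_first:
            correct += 1
    density = comparable / total if total else 0.0
    precision = correct / comparable if comparable else 0.0
    return precision, density


# Hypothetical example: four nodes split into two ordered clusters.
clusters = {"a": 0, "b": 0, "c": 1, "d": 1}
arrivals = {"a": 1, "b": 2, "c": 3, "d": 4}
precision, density = precision_and_density(clusters, arrivals)
```

Putting every node in its own cluster maximises density but risks ordering errors, while a single cluster orders nothing, which is the trade-off the fixed-density formulation controls.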