When is a Network a Network? Multi-Order Graphical Model Selection in Pathways and Temporal Networks
We introduce a framework for the modeling of sequential data capturing
pathways of varying lengths observed in a network. Such data are important,
e.g., when studying click streams in information networks, travel patterns in
transportation systems, information cascades in social networks, biological
pathways or time-stamped social interactions. While it is common to apply graph
analytics and network analysis to such data, recent works have shown that
temporal correlations can invalidate the results of such methods. This raises a
fundamental question: when is a network abstraction of sequential data
justified? Addressing this open question, we propose a framework which combines
Markov chains of multiple, higher orders into a multi-layer graphical model
that captures temporal correlations in pathways at multiple length scales
simultaneously. We develop a model selection technique to infer the optimal
number of layers of such a model and show that it outperforms previously used
Markov order detection techniques. An application to eight real-world data sets
on pathways and temporal networks shows that it allows us to infer graphical
models which capture both topological and temporal characteristics of such
data. Our work highlights fallacies of network abstractions and provides a
principled answer to the open question of when they are justified. Generalizing
network representations to multi-order graphical models, it opens perspectives
for new data mining and knowledge discovery algorithms. Comment: 10 pages, 4 figures, 1 table, companion Python package pathpy
available on gitHu
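To make the role of temporal correlations in the abstract above concrete, here is a minimal sketch that fits plain Markov chains of a fixed order to a set of pathways and compares their in-sample log-likelihoods. It is not the multi-order model or the model selection technique of the paper, and it does not use the pathpy package; the function names `fit_markov` and `log_likelihood` are hypothetical.

```python
from collections import Counter, defaultdict
import math


def fit_markov(paths, order):
    """Count transitions history -> next node, where history holds `order` nodes."""
    counts = defaultdict(Counter)
    for path in paths:
        for i in range(order, len(path)):
            history = tuple(path[i - order:i])
            counts[history][path[i]] += 1
    return counts


def log_likelihood(paths, order):
    """In-sample log-likelihood of the maximum-likelihood chain of the given order.

    Assumption: the first `order` nodes of each path are treated as given,
    so chains of different orders score different numbers of transitions.
    """
    counts = fit_markov(paths, order)
    ll = 0.0
    for next_counts in counts.values():
        total = sum(next_counts.values())
        for c in next_counts.values():
            ll += c * math.log(c / total)
    return ll


# Two pathways through node "b": where a walker goes next depends on its origin.
paths = [["a", "b", "c"], ["d", "b", "e"]]
first_order = log_likelihood(paths, 1)   # "b" is followed by "c" or "e" with equal frequency
second_order = log_likelihood(paths, 2)  # ("a","b") -> "c" and ("d","b") -> "e" are deterministic
```

In this toy input a first-order (network) view cannot tell the two pathways apart at "b", while a second-order chain can. Because the two chains score different numbers of transitions, the raw likelihoods are not directly comparable, which is one reason a combined multi-order model is of interest.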
On Ordinal Invariants in Well Quasi Orders and Finite Antichain Orders
We investigate the ordinal invariants height, length, and width of well quasi
orders (WQO), with particular emphasis on width, an invariant of interest for
the larger class of orders with finite antichain condition (FAC). We show that
the width in the class of FAC orders is completely determined by the width in
the class of WQOs, in the sense that if we know how to calculate the width of
any WQO then we have a procedure to calculate the width of any given FAC order.
We show how the width of WQO orders obtained via some classical constructions
can sometimes be computed in a compositional way. In particular, this allows
proving that every ordinal can be obtained as the width of some WQO poset. One
of the difficult questions is to give a complete formula for the width of
Cartesian products of WQOs. Even the width of the product of two ordinals is
only known through a complex recursive formula. Although we have not given a
complete answer to this question we have advanced the state of knowledge by
considering some more complex special cases and in particular by calculating
the width of certain products containing three factors. In the course of
writing the paper we have discovered that some of the relevant literature was
written at cross-purposes and that some of the notions were rediscovered several times.
Therefore we also use the occasion to give a unified presentation of the known
results
Permutation Models for Collaborative Ranking
We study the problem of collaborative filtering where ranking information is
available. Focusing on the core of the collaborative ranking process, the user
and their community, we propose new models for representation of the underlying
permutations and prediction of ranks. The first approach is based on the
assumption that the user makes successive choices of items in a stage-wise
manner. In particular, we extend the Plackett-Luce model in two ways -
introducing parameter factoring to account for user-specific contribution, and
modelling the latent community in a generative setting. The second approach
relies on log-linear parameterisation, which relaxes the discrete-choice
assumption, but makes learning and inference much more involved. We propose
MCMC-based learning and inference methods and derive linear-time prediction
algorithms
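The stage-wise choice assumption mentioned in the abstract above can be sketched with the standard Plackett-Luce model: at each stage the next item is chosen with probability proportional to its worth among the items not yet ranked. This is the plain model only, not the factored or community-based extensions the authors propose; the function name and the `scores` dictionary are hypothetical.

```python
import math


def plackett_luce_log_likelihood(ranking, scores):
    """Log-probability of `ranking` (best item first) under plain Plackett-Luce.

    `scores` maps each item to a real-valued utility; an item's worth is exp(score).
    """
    log_lik = 0.0
    remaining = list(ranking)
    for item in ranking:
        # The chosen item competes against every item not yet picked.
        log_norm = math.log(sum(math.exp(scores[j]) for j in remaining))
        log_lik += scores[item] - log_norm
        remaining.remove(item)
    return log_lik


# With equal scores every ordering of three items is equally likely (1/6).
uniform = math.exp(plackett_luce_log_likelihood(["x", "y", "z"],
                                                {"x": 0.0, "y": 0.0, "z": 0.0}))
```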
Evaluating Variable Length Markov Chain Models for Analysis of User Web Navigation Sessions
Markov models have been widely used to represent and analyse user web
navigation data. In previous work we have proposed a method to dynamically
extend the order of a Markov chain model and a complementary method for
assessing the predictive power of such a variable length Markov chain. Herein,
we review these two methods and propose a novel method for measuring the
ability of a variable length Markov model to summarise user web navigation
sessions up to a given length. While the summarisation ability of a model is
important to enable the identification of user navigation patterns, the ability
to make predictions is important in order to foresee the next link choice of a
user after following a given trail so as, for example, to personalise a web
site. We present an extensive experimental evaluation providing strong evidence
that prediction accuracy increases linearly with summarisation ability
Model Theoretic Complexity of Automatic Structures
We study the complexity of automatic structures via well-established concepts
from both logic and model theory, including ordinal heights (of well-founded
relations), Scott ranks of structures, and Cantor-Bendixson ranks (of trees).
We prove the following results: 1) The ordinal height of any automatic well-founded
partial order is bounded by $\omega^\omega$; 2) The ordinal heights of
automatic well-founded relations are unbounded below the first non-computable
ordinal; 3) For any computable ordinal there is an automatic structure of Scott
rank at least that ordinal. Moreover, there are automatic structures of Scott
rank the first non-computable ordinal and its successor; 4) For any computable
ordinal, there is an automatic successor tree of Cantor-Bendixson rank that
ordinal. Comment: 23 pages. Extended abstract appeared in Proceedings of TAMC '08, LNCS
4978 pp 514-52
Temporal Ordered Clustering in Dynamic Networks: Unsupervised and Semi-supervised Learning Algorithms
In temporal ordered clustering, given a single snapshot of a dynamic network
in which nodes arrive at distinct time instants, we aim at partitioning its
nodes into $K$ ordered clusters $C_1, \dots, C_K$ such that for $i < j$, nodes in cluster $C_i$ arrived
before nodes in cluster $C_j$, with $K$ being a data-driven parameter
and not known upfront. Such a problem is of considerable significance in many
applications ranging from tracking the expansion of fake news to mapping the
spread of information. We first formulate our problem for a general dynamic
graph, and propose an integer programming framework that finds the optimal
clustering, represented as a strict partial order set, achieving the best
precision (i.e., fraction of successfully ordered node pairs) for a fixed
density (i.e., fraction of comparable node pairs). We then develop a sequential
importance procedure and design unsupervised and semi-supervised algorithms to
find temporal ordered clusters that efficiently approximate the optimal
solution. To illustrate the techniques, we apply our methods to the vertex
copying (duplication-divergence) model which exhibits some edge-case challenges
in inferring the clusters as compared to other network models. Finally, we
validate the performance of the proposed algorithms on synthetic and real-world
networks. Comment: 14 pages, 9 figures, and 3 tables. This version is submitted to a
journal. A shorter version of this work is published in the proceedings of
IEEE International Symposium on Information Theory (ISIT), 2020. The first
two authors contributed equall
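The precision and density measures named in the abstract above can be illustrated with a short sketch. It assumes that a pair of nodes is comparable when the two nodes fall in different clusters, and that precision is measured among comparable pairs only; the function name and the input dictionaries are hypothetical and not taken from the paper.

```python
from itertools import combinations


def precision_and_density(cluster_of, arrival_time):
    """Score an ordered clustering against known arrival times.

    cluster_of:   node -> cluster index (a lower index means "arrived earlier")
    arrival_time: node -> true arrival time (assumed distinct)
    """
    total = comparable = correct = 0
    for u, v in combinations(list(cluster_of), 2):
        total += 1
        if cluster_of[u] == cluster_of[v]:
            continue  # same cluster: the pair is left unordered
        comparable += 1
        predicted_u_first = cluster_of[u] < cluster_of[v]
        actual_u_first = arrival_time[u] < arrival_time[v]
        if predicted_u_first == actual_u_first:
            correct += 1
    density = comparable / total if total else 0.0
    precision = correct / comparable if comparable else 0.0
    return precision, density


# Two clusters of two nodes each: 4 of the 6 pairs are comparable.
clusters = {"a": 0, "b": 0, "c": 1, "d": 1}
times = {"a": 1, "b": 4, "c": 2, "d": 3}
precision, density = precision_and_density(clusters, times)
```

Coarser clusterings order fewer pairs (lower density) but tend to make fewer mistakes (higher precision), which is the trade-off the optimization in the abstract fixes one side of.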