A Bag-of-Paths Node Criticality Measure
This work compares several node (and network) criticality measures
quantifying to what extent each node is critical with respect to the
communication flow between nodes of the network, and introduces a new measure
based on the Bag-of-Paths (BoP) framework. Network disconnection simulation
experiments show that the new BoP measure outperforms all the other measures on
a sample of Erdos-Renyi and Albert-Barabasi graphs. Furthermore, a faster
(still O(n^3)), approximate, BoP criticality relying on the Sherman-Morrison
rank-one update of a matrix is introduced for tackling larger networks. This
approximate measure performs similarly to the original, exact one.
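The Sherman-Morrison identity mentioned in the abstract above can be illustrated in isolation. The sketch below only shows the generic rank-one update of a matrix inverse; how the paper applies it to the BoP criticality is not stated in the abstract, so the function name and the toy matrices are illustrative assumptions.

```python
def sherman_morrison(A_inv, u, v):
    """Return (A + u v^T)^{-1} from a known A^{-1} in O(n^2) operations."""
    n = len(A_inv)
    # A^{-1} u (column vector) and v^T A^{-1} (row vector)
    Au = [sum(A_inv[i][k] * u[k] for k in range(n)) for i in range(n)]
    vA = [sum(v[k] * A_inv[k][j] for k in range(n)) for j in range(n)]
    denom = 1.0 + sum(v[k] * Au[k] for k in range(n))
    if abs(denom) < 1e-12:
        raise ValueError("the rank-one update makes the matrix singular")
    return [[A_inv[i][j] - Au[i] * vA[j] / denom for j in range(n)]
            for i in range(n)]


# Toy example (hypothetical data): A = diag(2, 4), so A^{-1} = diag(0.5, 0.25).
A_inv = [[0.5, 0.0], [0.0, 0.25]]
u = [1.0, 0.0]
v = [0.0, 1.0]
# A + u v^T = [[2, 1], [0, 4]], whose inverse is [[0.5, -0.125], [0, 0.25]].
updated_inv = sherman_morrison(A_inv, u, v)
```

Each update costs O(n^2) instead of the O(n^3) of a fresh inversion. Repeating it for each of the n nodes would account for the overall O(n^3) quoted above, though that reading is an inference rather than something the abstract states.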
Developments in the theory of randomized shortest paths with a comparison of graph node distances
There have lately been several suggestions for parametrized distances on a
graph that generalize the shortest path distance and the commute time or
resistance distance. The need for developing such distances has risen from the
observation that the above-mentioned common distances in many situations fail
to take into account the global structure of the graph. In this article, we
develop the theory of one family of graph node distances, known as the
randomized shortest path dissimilarity, which has its foundation in statistical
physics. We show that the randomized shortest path dissimilarity can be easily
computed in closed form for all pairs of nodes of a graph. Moreover, we come up
with a new definition of a distance measure that we call the free energy
distance. The free energy distance can be seen as an upgrade of the randomized
shortest path dissimilarity as it defines a metric, in addition to which it
satisfies the graph-geodetic property. The derivation and computation of the
free energy distance are also straightforward. We then make a comparison
between a set of generalized distances that interpolate between the shortest
path distance and the commute time, or resistance distance. This comparison
focuses on the applicability of the distances in graph node clustering and
classification. The comparison, in general, shows that the parametrized
distances perform well in the tasks. In particular, we see that the results
obtained with the free energy distance are among the best in all the
experiments.
Comment: 30 pages, 4 figures, 3 tables
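As a rough illustration of the closed-form computation mentioned above, the sketch below assumes the hitting-path formulation commonly given in the RSP literature: W = P_ref ∘ exp(-βC), Z = (I - W)^{-1}, directed free energy φ(i, j) = -(1/β) log(z_ij / z_jj), and the distance as the symmetrized φ. None of these formulas appear in the abstract itself, and the function names and the toy graph are hypothetical. The graph is assumed connected, with positive costs and β > 0.

```python
import math


def invert(M):
    """Gauss-Jordan inversion with partial pivoting (fine for small matrices)."""
    n = len(M)
    aug = [[float(x) for x in row] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(M)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [x / p for x in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def free_energy_distance(adjacency, cost, beta):
    """All-pairs free energy distance under the assumptions stated above."""
    n = len(adjacency)
    W = []
    for i in range(n):
        degree = sum(adjacency[i])
        # Reference random-walk probability times the Boltzmann factor exp(-beta * cost)
        W.append([(adjacency[i][j] / degree) * math.exp(-beta * cost[i][j])
                  if adjacency[i][j] > 0 else 0.0 for j in range(n)])
    Z = invert([[(1.0 if i == j else 0.0) - W[i][j] for j in range(n)]
                for i in range(n)])
    # Directed free energy between i and j, based on hitting paths
    phi = [[-math.log(Z[i][j] / Z[j][j]) / beta for j in range(n)]
           for i in range(n)]
    return [[0.5 * (phi[i][j] + phi[j][i]) for j in range(n)] for i in range(n)]


# Toy path graph 0 - 1 - 2 with unit edge costs (hypothetical data).
adjacency = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
cost = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
D = free_energy_distance(adjacency, cost, beta=1.0)
```

On this toy graph, node 1 separates nodes 0 and 2, so the graph-geodetic property implies D[0][2] = D[0][1] + D[1][2].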
Two betweenness centrality measures based on Randomized Shortest Paths
This paper introduces two new closely related betweenness centrality measures
based on the Randomized Shortest Paths (RSP) framework, which fill a gap
between traditional network centrality measures based on shortest paths and
more recent methods considering random walks or current flows. The framework
defines Boltzmann probability distributions over paths of the network which
focus on the shortest paths, but also take into account longer paths depending
on an inverse temperature parameter. RSPs have previously proven to be useful
in defining distance measures on networks. In this work we study their utility
in quantifying the importance of the nodes of a network. The proposed RSP
betweenness centralities combine, in an optimal way, the ideas of using the
shortest and purely random paths for analysing the roles of network nodes,
avoiding issues involving these two paradigms. We present the derivations of
these measures and how they can be computed in an efficient way. In addition,
we show with real world examples the potential of the RSP betweenness
centralities in identifying interesting nodes of a network that more
traditional methods might fail to notice.
Comment: Minor updates; published in Scientific Reports
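The role of the inverse temperature in the Boltzmann distribution over paths can be seen on a toy case. The sketch below assumes a small, explicitly enumerated set of paths, each with a reference (random-walk) probability and a total cost. The actual RSP framework sums over all paths of the network, and the function name and the numbers here are hypothetical.

```python
import math


def boltzmann_path_distribution(paths, beta):
    """paths: list of (reference_probability, cost) pairs.

    Returns P(path) proportional to reference_probability * exp(-beta * cost).
    """
    weights = [p_ref * math.exp(-beta * cost) for p_ref, cost in paths]
    total = sum(weights)
    return [w / total for w in weights]


# Three hypothetical source-to-target paths: (reference probability, cost).
paths = [(0.5, 2.0), (0.25, 3.0), (0.25, 5.0)]

random_walk_like = boltzmann_path_distribution(paths, beta=0.0)     # equals the reference
shortest_path_like = boltzmann_path_distribution(paths, beta=10.0)  # mass on the cost-2 path
```

With beta = 0 the distribution reduces to the reference probabilities. As beta grows, almost all probability mass moves to the least-cost path, which is the interpolation between random walks and shortest paths that the abstract describes.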
A maximum entropy approach to multiple classifiers combination
In this paper, we present a maximum entropy (maxent) approach to the problem of fusing
experts' opinions, or classifiers' outputs. The maxent approach is quite
versatile and allows us to express in a clear, rigorous way the a priori knowledge
that is available on the problem. For instance, our knowledge about the reliability
of the experts and the correlations between these experts can be easily integrated:
Each piece of knowledge is expressed in the form of a linear constraint.
An iterative scaling algorithm is used in order to compute the maxent solution
of the problem. The maximum entropy method seeks the joint probability density
of a set of random variables that has maximum entropy while satisfying the
constraints. It is therefore the "most honest" characterization of our knowledge
given the available facts (constraints). In the case of conflicting constraints, we
propose to minimise the "lack of constraints satisfaction" or to relax some constraints
and recompute the maximum entropy solution. The maxent fusion rule
is illustrated by some simulations.
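To give a feel for iterative scaling under linear constraints, the sketch below uses iterative proportional fitting, a simple member of that family, on a joint distribution of two discrete variables whose marginals are fixed. This is not the paper's algorithm or constraint set (reliability and correlation constraints are not modelled); the function name and the target marginals are hypothetical, and all targets are assumed strictly positive.

```python
def iterative_scaling(row_targets, col_targets, iterations=50):
    """Maxent joint distribution with prescribed marginals, found by IPF."""
    n_rows, n_cols = len(row_targets), len(col_targets)
    # Start from the uniform joint distribution (maximum entropy, no constraints).
    joint = [[1.0 / (n_rows * n_cols)] * n_cols for _ in range(n_rows)]
    for _ in range(iterations):
        # Rescale each row so that the row marginal constraint holds.
        for i in range(n_rows):
            row_sum = sum(joint[i])
            joint[i] = [x * row_targets[i] / row_sum for x in joint[i]]
        # Rescale each column so that the column marginal constraint holds.
        for j in range(n_cols):
            col_sum = sum(joint[i][j] for i in range(n_rows))
            for i in range(n_rows):
                joint[i][j] *= col_targets[j] / col_sum
    return joint


# Hypothetical marginals for two binary variables.
row_targets = [0.7, 0.3]
col_targets = [0.6, 0.4]
joint = iterative_scaling(row_targets, col_targets)
```

With marginal constraints only, the maximum entropy solution is the product of the marginals, so joint[i][j] should match row_targets[i] * col_targets[j].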
Covariance and Correlation Kernels on a Graph in the Generalized Bag-of-Paths Formalism
This work derives closed-form expressions computing the expectation of
co-presence and of number of co-occurrences of nodes on paths sampled from a
network according to general path weights (a bag of paths). The underlying idea
is that two nodes are considered as similar when they often appear together on
(preferably short) paths of the network. The different expressions are obtained
for both regular and hitting paths and serve as a basis for computing new
covariance and correlation measures between nodes, which are valid positive
semi-definite kernels on a graph. Experiments on semi-supervised classification
problems show that the introduced similarity measures provide competitive
results compared to other state-of-the-art distance and similarity measures
between nodes.
Randomized Optimal Transport on a Graph: framework and new distance measures
The recently developed bag-of-paths (BoP) framework consists in setting a
Gibbs-Boltzmann distribution on all feasible paths of a graph. This probability
distribution favors short paths over long ones, with a free parameter (the
temperature T) controlling the entropic level of the distribution. This
formalism enables the computation of new distances or dissimilarities,
interpolating between the shortest-path and the resistance distance, which have
been shown to perform well in clustering and classification tasks. In this
work, the bag-of-paths formalism is extended by adding two independent equality
constraints fixing starting and ending nodes distributions of paths (margins).
When the temperature is low, this formalism is shown to be equivalent to a
relaxation of the optimal transport problem on a network where paths carry a
flow between two discrete distributions on nodes. The randomization is achieved
by considering free energy minimization instead of traditional cost
minimization. Algorithms computing the optimal free energy solution are
developed for two types of paths: hitting (or absorbing) paths and non-hitting,
regular, paths, and require the inversion of an n × n matrix, with n
being the number of nodes. Interestingly, for regular paths on an undirected
graph, the resulting optimal policy interpolates between the deterministic
optimal transport policy (temperature T → 0) and the solution to the
corresponding electrical circuit (T → ∞). Two distance
measures between nodes and a dissimilarity between groups of nodes, both
integrating weights on nodes, are derived from this framework.
Comment: Preprint paper to appear in Network Science journal, Cambridge
University Press
Sparse Randomized Shortest Paths Routing with Tsallis Divergence Regularization
This work elaborates on the important problem of (1) designing optimal
randomized routing policies for reaching a target node t from a source node s
on a weighted directed graph G and (2) defining distance measures between nodes
interpolating between the least cost (based on optimal movements) and the
commute-cost (based on a random walk on G), depending on a temperature
parameter T. To this end, the randomized shortest path formalism (RSP,
[2,99,124]) is rephrased in terms of Tsallis divergence regularization, instead
of Kullback-Leibler divergence. The main consequence of this change is that the
resulting routing policy (local transition probabilities) becomes sparser when
T decreases, therefore inducing a sparse random walk on G converging to the
least-cost directed acyclic graph when T tends to 0. Experimental comparisons
on node clustering and semi-supervised classification tasks show that the
derived dissimilarity measures based on expected routing costs provide
state-of-the-art results. The sparse RSP is therefore a promising model of
movements on a graph, balancing sparse exploitation and exploration in an
optimal way.
Randomized Shortest Paths with Net Flows and Capacity Constraints
This work extends the randomized shortest paths (RSP) model by investigating
the net flow RSP and adding capacity constraints on edge flows. The standard
RSP is a model of movement, or spread, through a network interpolating between
a random-walk and a shortest-path behavior [30, 42, 49]. The framework assumes
a unit flow injected into a source node and collected from a target node with
flows minimizing the expected transportation cost, together with a relative
entropy regularization term. In this context, the present work first develops
the net flow RSP model considering that edge flows in opposite directions
neutralize each other (as in electric networks), and proposes an algorithm for
computing the expected routing costs between all pairs of nodes. This quantity
is called the net flow RSP dissimilarity measure between nodes. Experimental
comparisons on node clustering tasks indicate that the net flow RSP
dissimilarity is competitive with other state-of-the-art dissimilarities. In
the second part of the paper, it is shown how to introduce capacity constraints
on edge flows, and a procedure is developed to solve this constrained problem
by exploiting Lagrangian duality. These two extensions should improve
significantly the scope of applications of the RSP framework.
Maximum likelihood estimation for randomized shortest paths with trajectory data
Randomized shortest paths (RSPs) are a tool developed in recent years for different graph and network analysis applications, such as modelling movement or flow in networks. In essence, the RSP framework considers the temperature-dependent Gibbs-Boltzmann distribution over paths in the network. At low temperatures, the distribution focuses solely on the shortest or least-cost paths, while with increasing temperature, the distribution spreads over random walks on the network. Many relevant quantities can be computed conveniently from this distribution, and these often generalize traditional network measures in a sensible way. However, when modelling real phenomena with RSPs, one needs a principled way of estimating the parameters from data. In this work, we develop methods for computing the maximum likelihood estimate of the model parameters, with a focus on the temperature parameter, when modelling phenomena based on movement, flow or spreading processes. We test the validity of the derived methods with trajectories generated on artificial networks as well as with real data on the movement of wild reindeer in a geographic landscape, used for estimating the degree of randomness in the movement of the animals. These examples demonstrate the attractiveness of the RSP framework as a generic model to be used in diverse applications.
Keywords: randomized shortest paths; random walk; shortest path; parameter estimation; maximum likelihood; animal movement modelling
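The idea of estimating the temperature by maximum likelihood can be sketched on a toy model. The code below assumes a finite, enumerated set of paths with distribution P(path) proportional to p_ref * exp(-beta * cost), where beta = 1/T. For this exponential family, the likelihood equation reduces to matching the model's expected cost to the observed mean cost, which is solved here by bisection. The paper's estimators work on trajectories in networks and are not reproduced here; all names and numbers are hypothetical.

```python
import math


def expected_cost(paths, beta):
    """Mean path cost under P(path) proportional to p_ref * exp(-beta * cost)."""
    weights = [p_ref * math.exp(-beta * cost) for p_ref, cost in paths]
    total = sum(weights)
    return sum(w * cost for w, (_, cost) in zip(weights, paths)) / total


def fit_inverse_temperature(paths, observed_mean_cost, lo=0.0, hi=50.0, steps=60):
    """Maximum likelihood beta: solve expected_cost(beta) = observed mean cost.

    expected_cost decreases as beta grows, so bisection applies as long as
    the solution lies inside [lo, hi].
    """
    for _ in range(steps):
        mid = 0.5 * (lo + hi)
        if expected_cost(paths, mid) > observed_mean_cost:
            lo = mid  # model paths still too costly on average: increase beta
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Hypothetical paths: (reference probability, cost).
paths = [(0.5, 2.0), (0.25, 3.0), (0.25, 5.0)]
# Pretend the data were generated at beta = 1.5 and use the exact mean cost.
observed_mean = expected_cost(paths, 1.5)
beta_hat = fit_inverse_temperature(paths, observed_mean)
```

With real data, observed_mean would be the average cost of the recorded trajectories, and the estimate would carry sampling error rather than recovering the generating value exactly.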