Efficient computation of the Weighted Clustering Coefficient
The clustering coefficient of an unweighted network has been used extensively to quantify how tightly connected the neighborhood of a node is, and it has been widely adopted for assessing the quality of nodes in a social network. Computing the clustering coefficient is challenging because it requires counting the triangles in the graph. Several recent works have proposed efficient sampling, streaming, and MapReduce algorithms that overcome this computational bottleneck. The intensity of the interaction between nodes, usually represented by weights on the edges of the graph, is also an important measure of the statistical cohesiveness of a network. Various notions of weighted clustering coefficient have recently been proposed, but all of these techniques are hard to implement on large-scale graphs. In this work we show how standard sampling techniques can be used to obtain efficient estimators for the most commonly used measures of weighted clustering coefficient. Furthermore, we propose a novel graph-theoretic notion of clustering coefficient in weighted networks. © 2016, Copyright © Taylor & Francis Group, LL
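The sampling idea mentioned in this abstract can be illustrated on the simpler unweighted case. The sketch below is not the paper's weighted estimator; it is a plain wedge-sampling estimate of the global clustering coefficient (the fraction of closed wedges), assuming the graph is given as a dictionary mapping each node to a set of neighbors. The function name and the data layout are hypothetical.

```python
import random

def estimate_global_clustering(adj, num_samples, rng):
    """Estimate the fraction of closed wedges by uniform wedge sampling.

    adj: dict mapping node -> set of neighbouring nodes (undirected graph).
    """
    centers = [v for v in adj if len(adj[v]) >= 2]
    if not centers:
        return 0.0
    # A node of degree d is the centre of d * (d - 1) / 2 wedges, so
    # sampling centres with these weights makes every wedge equally likely.
    wedge_counts = [len(adj[v]) * (len(adj[v]) - 1) // 2 for v in centers]
    closed = 0
    for center in rng.choices(centers, weights=wedge_counts, k=num_samples):
        a, b = rng.sample(sorted(adj[center]), 2)
        if b in adj[a]:  # the wedge a - center - b closes into a triangle
            closed += 1
    return closed / num_samples
```

Each sample costs one adjacency lookup, so the estimate avoids enumerating all triangles; its accuracy depends on the number of sampled wedges.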
Random Forests and Networks Analysis
In the 1990s, D. Wilson~\cite{[Wi]} described a simple and efficient
algorithm based on loop-erased random walks to sample uniform spanning trees
and more generally weighted trees or forests spanning a given graph. This
algorithm provides a powerful tool in analyzing structures on networks and
along this line of thinking, in recent works~\cite{AG1,AG2,ACGM1,ACGM2} we
focused on applications of spanning rooted forests on finite graphs. The
resulting main conclusions are reviewed in this paper by collecting related
theorems, algorithms, heuristics and numerical experiments. A first
foundational part on determinantal structures and efficient sampling procedures
is followed by four main applications: 1) a random-walk-based notion of
well-distributed points in a graph; 2) a description of metastable dynamics in
finite settings by means of Markov intertwining dualities; 3) coarse-graining
schemes for networks and associated processes; 4) wavelet-like pyramidal
algorithms for graph signals.
Comment: Survey pape
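As an illustration of the loop-erased random walk construction mentioned in this abstract, here is a minimal sketch of Wilson's algorithm for the unweighted spanning-tree case only. It assumes a connected undirected graph stored as a dictionary from node to a list of neighbors; the function name and representation are hypothetical, and the weighted and rooted-forest variants discussed in the survey are not covered.

```python
import random

def wilson_spanning_tree(adj, root, rng):
    """Sample a uniform spanning tree of a connected undirected graph.

    adj: dict mapping node -> list of neighbours.
    Returns a dict mapping every non-root node to its parent in the tree.
    """
    in_tree = {root}
    parent = {}
    for start in adj:
        # Walk randomly from `start` until the current tree is hit.
        # Overwriting next_step[u] on each revisit erases loops implicitly.
        next_step = {}
        u = start
        while u not in in_tree:
            next_step[u] = rng.choice(adj[u])
            u = next_step[u]
        # Retrace the loop-erased path and attach it to the tree.
        u = start
        while u not in in_tree:
            in_tree.add(u)
            parent[u] = next_step[u]
            u = next_step[u]
    return parent
```

The result is independent of the order in which start nodes are visited, which is one reason the procedure is simple to implement.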
Degree Ranking Using Local Information
Most real-world dynamic networks evolve very quickly over time. It is not
feasible to collect the entire network at any given time to study its
characteristics. This creates the need for local algorithms to study
various properties of the network. In the present work, we estimate the degree rank
of a node without access to the entire network. The proposed methods are based on
the power-law degree distribution or on sampling techniques. The
proposed methods are simulated on synthetic networks as well as on real-world
social networks. Their efficiency is evaluated using
absolute and weighted error functions. Results show that the degree rank of a
node can be estimated with high accuracy using only samples of the
network size. The accuracy of the estimation decreases from high-ranked to
low-ranked nodes. We further extend the proposed methods to random networks and
validate their efficiency on synthetic random networks generated
using the Erdős-Rényi model. Results show that the proposed methods can be
used efficiently for random networks as well.
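A sampling-based rank estimate of the kind described in this abstract might look like the sketch below. This is not the authors' estimator. It assumes that node identifiers can be sampled uniformly, that the network size is known, and that the degree of any sampled node can be queried locally through a hypothetical `degree_of` callback.

```python
import random

def estimate_degree_rank(degree_of, node_ids, target, sample_size, rng):
    """Estimate the degree rank of `target` (rank 1 = highest degree).

    degree_of: callable returning the degree of a node (a local query).
    node_ids: list of all node identifiers; only a sample is inspected.
    """
    target_degree = degree_of(target)
    sample = rng.sample(node_ids, sample_size)
    higher = sum(1 for v in sample if degree_of(v) > target_degree)
    # Scale the sampled fraction of higher-degree nodes to the whole network.
    return 1 + len(node_ids) * higher / sample_size
```

With a sample covering every node the estimate equals the exact rank; smaller samples trade accuracy for fewer degree queries.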
A Monte-Carlo Algorithm for Probabilistic Propagation in Belief Networks based on Importance Sampling and Stratified Simulation Techniques
A class of Monte Carlo algorithms for probability propagation in belief networks is given.
The simulation is based on a two-step procedure. The first step is a node deletion technique
to calculate the ’a posteriori’ distribution on a variable, with the particularity that when
exact computations are too costly, they are carried out in an approximate way. In the second
step, the computations done in the first one are used to obtain random configurations for the
variables of interest. These configurations are weighted according to the importance sampling
methodology. Different particular algorithms are obtained depending on the approximation
procedure used in the first step and on the way the random configurations are obtained. For
the latter, a stratified sampling technique is used, which has been adapted so that it can be applied
to very large networks without round-off error problems.
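The importance-sampling weighting step can be illustrated on a toy network. The sketch below uses plain likelihood weighting, a simpler technique than the node-deletion and stratified scheme of the abstract, on a hypothetical two-node network Rain -> WetGrass with WetGrass observed to be true. All names and probabilities are assumptions made for the example.

```python
import random

def likelihood_weighting(p_rain, p_wet_given_rain, p_wet_given_dry,
                         num_samples, rng):
    """Estimate P(rain | wet grass observed) by importance sampling."""
    weighted_rain = 0.0
    total_weight = 0.0
    for _ in range(num_samples):
        # Sample the unobserved variable from its prior ...
        rain = rng.random() < p_rain
        # ... and weight the configuration by the likelihood of the evidence.
        weight = p_wet_given_rain if rain else p_wet_given_dry
        total_weight += weight
        if rain:
            weighted_rain += weight
    return weighted_rain / total_weight
```

For p_rain = 0.2, p_wet_given_rain = 0.9 and p_wet_given_dry = 0.1, the exact posterior is 0.18 / 0.26, and the estimate approaches it as the number of samples grows.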
Unbiased sampling of network ensembles
Sampling random graphs with given properties is a key step in the analysis of
networks, as random ensembles represent basic null models required to identify
patterns such as communities and motifs. An important requirement is that the
sampling process is unbiased and efficient. The main approaches are
microcanonical, i.e. they sample graphs that match the enforced constraints
exactly. Unfortunately, when applied to strongly heterogeneous networks (like
most real-world examples), the majority of these approaches become biased
and/or time-consuming. Moreover, the algorithms defined in the simplest cases,
such as binary graphs with given degrees, are not easily generalizable to more
complicated ensembles. Here we propose a solution to the problem via the
introduction of a "Maximize and Sample" ("Max & Sam" for short) method to
correctly sample ensembles of networks where the constraints are `soft', i.e.
realized as ensemble averages. Our method is based on exact maximum-entropy
distributions and is therefore unbiased by construction, even for strongly
heterogeneous networks. It is also more computationally efficient than most
microcanonical alternatives. Finally, it works for both binary and weighted
networks with a variety of constraints, including combined degree-strength
sequences and full reciprocity structure, for which no alternative method
exists. Our canonical approach can in principle be turned into an unbiased
microcanonical one, via a restriction to the relevant subset. Importantly, the
analysis of the fluctuations of the constraints suggests that the
microcanonical and canonical versions of all the ensembles considered here are
not equivalent. We show various real-world applications and provide a code
implementing all our algorithms.
Comment: MATLAB code available at
http://www.mathworks.it/matlabcentral/fileexchange/46912-max-sam-package-zi
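The "Sample" half of such a canonical approach can be sketched for the simplest ensemble, binary undirected graphs with soft degree constraints. In that maximum-entropy ensemble each pair of nodes is linked independently with probability p_ij = x_i x_j / (1 + x_i x_j). The sketch below assumes the hidden variables x_i have already been fitted in a separate "Maximize" step, which is not shown; the function names are hypothetical and this is not the linked package's code.

```python
import random

def link_probability(x_i, x_j):
    """Connection probability in the binary undirected ensemble."""
    return x_i * x_j / (1.0 + x_i * x_j)

def expected_degrees(x):
    """Ensemble-average degree of each node implied by the values x."""
    n = len(x)
    return [sum(link_probability(x[i], x[j]) for j in range(n) if j != i)
            for i in range(n)]

def sample_binary_graph(x, rng):
    """Draw one graph: each pair (i, j), i < j, is linked independently."""
    n = len(x)
    edges = []
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < link_probability(x[i], x[j]):
                edges.append((i, j))
    return edges
```

Because pairs are sampled independently, one graph costs a single pass over all node pairs, and the degree constraints hold on average over the ensemble rather than in every sampled graph.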
Weighted Random Walk Sampling for Multi-Relational Recommendation
On the information-overloaded web, personalized recommender systems are
essential tools that help users find the most relevant information. The most
heavily-used recommendation frameworks assume user interactions that are
characterized by a single relation. However, for many tasks, such as
recommendation in social networks, user-item interactions must be modeled as a
complex network of multiple relations, not only a single relation. Recent
research on multi-relational factorization and hybrid recommender models has
shown that using extended meta-paths to capture additional information about
both users and items in the network can enhance the accuracy of recommendations
in such networks. Most of this work is focused on unweighted heterogeneous
networks, and to apply these techniques, weighted relations must be simplified
into binary ones. However, information associated with weighted edges, such as
user ratings, which may be crucial for recommendation, is lost in such
binarization. In this paper, we explore a random walk sampling method in which
the frequency of edge sampling is a function of edge weight, and apply it to
generate extended meta-paths in weighted heterogeneous networks. With this
sampling technique, we demonstrate improved performance on multiple data sets
both in terms of recommendation accuracy and model generation efficiency.
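A weight-aware random walk of the kind described in this abstract can be sketched as follows. The abstract only says that the sampling frequency is a function of edge weight; the sketch assumes the simplest choice, a transition probability proportional to the weight, and a hypothetical nested-dictionary graph layout. Meta-path constraints on node and relation types are not modeled.

```python
import random

def weighted_random_walk(graph, start, length, rng):
    """Walk up to `length` steps, choosing edges in proportion to weight.

    graph: dict mapping node -> dict of {neighbour: edge weight}.
    Returns the list of visited nodes, starting with `start`.
    """
    path = [start]
    current = start
    for _ in range(length):
        neighbors = graph.get(current, {})
        if not neighbors:
            break  # dead end: no outgoing edges
        nodes = list(neighbors)
        weights = [neighbors[v] for v in nodes]
        current = rng.choices(nodes, weights=weights, k=1)[0]
        path.append(current)
    return path
```

Highly rated items are visited more often than weakly rated ones, so rating information survives in the sampled paths instead of being discarded by binarization.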