126,168 research outputs found

    Efficient computation of the Weighted Clustering Coefficient

    Get PDF
    The clustering coefficient of an unweighted network has been extensively used to quantify how tightly connected is the neighbor around a node and it has been widely adopted for assessing the quality of nodes in a social network. The computation of the clustering coefficient is challenging since it requires to count the number of triangles in the graph. Several recent works proposed efficient sampling, streaming and MapReduce algorithms that allow to overcome this computational bottleneck. As a matter of fact, the intensity of the interaction between nodes, that is usually represented with weights on the edges of the graph, is also an important measure of the statistical cohesiveness of a network. Recently various notions of weighted clustering coefficient have been proposed but all those techniques are hard to implement on large-scale graphs. In this work we show how standard sampling techniques can be used to obtain efficient estimators for the most commonly used measures of weighted clustering coefficient. Furthermore we also propose a novel graph-theoretic notion of clustering coefficient in weighted networks. © 2016, Copyright © Taylor & Francis Group, LL

    Random Forests and Networks Analysis

    Full text link
    D. Wilson~\cite{[Wi]} in the 1990's described a simple and efficient algorithm based on loop-erased random walks to sample uniform spanning trees and more generally weighted trees or forests spanning a given graph. This algorithm provides a powerful tool in analyzing structures on networks and along this line of thinking, in recent works~\cite{AG1,AG2,ACGM1,ACGM2} we focused on applications of spanning rooted forests on finite graphs. The resulting main conclusions are reviewed in this paper by collecting related theorems, algorithms, heuristics and numerical experiments. A first foundational part on determinantal structures and efficient sampling procedures is followed by four main applications: 1) a random-walk-based notion of well-distributed points in a graph 2) how to describe metastable dynamics in finite settings by means of Markov intertwining dualities 3) coarse graining schemes for networks and associated processes 4) wavelets-like pyramidal algorithms for graph signals.Comment: Survey pape

    Degree Ranking Using Local Information

    Get PDF
    Most real world dynamic networks are evolved very fast with time. It is not feasible to collect the entire network at any given time to study its characteristics. This creates the need to propose local algorithms to study various properties of the network. In the present work, we estimate degree rank of a node without having the entire network. The proposed methods are based on the power law degree distribution characteristic or sampling techniques. The proposed methods are simulated on synthetic networks, as well as on real world social networks. The efficiency of the proposed methods is evaluated using absolute and weighted error functions. Results show that the degree rank of a node can be estimated with high accuracy using only 1%1\% samples of the network size. The accuracy of the estimation decreases from high ranked to low ranked nodes. We further extend the proposed methods for random networks and validate their efficiency on synthetic random networks, that are generated using Erd\H{o}s-R\'{e}nyi model. Results show that the proposed methods can be efficiently used for random networks as well

    A Monte-Carlo Algorithm for Probabilistic Propagation in Belief Networks based on Importance Sampling and Stratified Simulation Techniques

    Get PDF
    A class of Monte Carlo algorithms for probability propagation in belief networks is given. The simulation is based on a two steps procedure. The first one is a node deletion technique to calculate the ’a posteriori’ distribution on a variable, with the particularity that when exact computations are too costly, they are carried out in an approximate way. In the second step, the computations done in the first one are used to obtain random configurations for the variables of interest. These configurations are weighted according to the importance sampling methodology. Different particular algorithms are obtained depending on the approximation procedure used in the first step and in the way of obtaining the random configurations. In this last case, a stratified sampling technique is used, which has been adapted to be applied to very large networks without problems with round-off errors

    Unbiased sampling of network ensembles

    Get PDF
    Sampling random graphs with given properties is a key step in the analysis of networks, as random ensembles represent basic null models required to identify patterns such as communities and motifs. An important requirement is that the sampling process is unbiased and efficient. The main approaches are microcanonical, i.e. they sample graphs that match the enforced constraints exactly. Unfortunately, when applied to strongly heterogeneous networks (like most real-world examples), the majority of these approaches become biased and/or time-consuming. Moreover, the algorithms defined in the simplest cases, such as binary graphs with given degrees, are not easily generalizable to more complicated ensembles. Here we propose a solution to the problem via the introduction of a "Maximize and Sample" ("Max & Sam" for short) method to correctly sample ensembles of networks where the constraints are `soft', i.e. realized as ensemble averages. Our method is based on exact maximum-entropy distributions and is therefore unbiased by construction, even for strongly heterogeneous networks. It is also more computationally efficient than most microcanonical alternatives. Finally, it works for both binary and weighted networks with a variety of constraints, including combined degree-strength sequences and full reciprocity structure, for which no alternative method exists. Our canonical approach can in principle be turned into an unbiased microcanonical one, via a restriction to the relevant subset. Importantly, the analysis of the fluctuations of the constraints suggests that the microcanonical and canonical versions of all the ensembles considered here are not equivalent. We show various real-world applications and provide a code implementing all our algorithms.Comment: MatLab code available at http://www.mathworks.it/matlabcentral/fileexchange/46912-max-sam-package-zi

    Weighted Random Walk Sampling for Multi-Relational Recommendation

    Full text link
    In the information overloaded web, personalized recommender systems are essential tools to help users find most relevant information. The most heavily-used recommendation frameworks assume user interactions that are characterized by a single relation. However, for many tasks, such as recommendation in social networks, user-item interactions must be modeled as a complex network of multiple relations, not only a single relation. Recently research on multi-relational factorization and hybrid recommender models has shown that using extended meta-paths to capture additional information about both users and items in the network can enhance the accuracy of recommendations in such networks. Most of this work is focused on unweighted heterogeneous networks, and to apply these techniques, weighted relations must be simplified into binary ones. However, information associated with weighted edges, such as user ratings, which may be crucial for recommendation, are lost in such binarization. In this paper, we explore a random walk sampling method in which the frequency of edge sampling is a function of edge weight, and apply this generate extended meta-paths in weighted heterogeneous networks. With this sampling technique, we demonstrate improved performance on multiple data sets both in terms of recommendation accuracy and model generation efficiency
    • …
    corecore