    A Faster Method to Estimate Closeness Centrality Ranking

    Closeness centrality is one way of measuring how central a node is in a given network. The closeness centrality measure assigns a centrality value to each node based on its accessibility to the whole network. In real-life applications, we are mainly interested in ranking nodes based on their centrality values. The classical method to compute the rank of a node first computes the closeness centrality of all nodes and then compares them to get its rank. Its time complexity is $O(n \cdot m + n)$, where $n$ is the total number of nodes and $m$ is the total number of edges in the network. In the present work, we propose a heuristic method to quickly estimate the closeness rank of a node in $O(\alpha \cdot m)$ time, where $\alpha = 3$. We also propose an extended, improved method using a uniform sampling technique. This method estimates the rank more accurately and has time complexity $O(\alpha \cdot m)$, where $\alpha \approx 10$ to $100$. This is an excellent improvement over the classical centrality ranking method. The efficiency of the proposed methods is verified on real-world scale-free social networks using absolute and weighted error functions.
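
The classical baseline described in this abstract (one traversal per node, then a comparison pass) can be sketched as follows. This is a minimal illustration, assuming an unweighted, undirected graph stored as an adjacency dictionary; the function names and the choice of closeness normalization are assumptions made here, and the sketch does not reproduce the authors' heuristic or sampling methods.

```python
from collections import deque

def bfs_distances(adj, source):
    """Hop distances from source to every reachable node (unweighted graph)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def closeness(adj, node):
    """Closeness as (reachable nodes - 1) / (sum of distances); 0.0 if isolated."""
    dist = bfs_distances(adj, node)
    total = sum(dist.values())
    return (len(dist) - 1) / total if total > 0 else 0.0

def closeness_rank(adj, target):
    """Rank of target (1 = most central): one BFS per node, then n comparisons."""
    scores = {u: closeness(adj, u) for u in adj}
    return 1 + sum(1 for u in adj if scores[u] > scores[target])

# Hypothetical example: the path graph a-b-c-d-e.
path = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c", "e"], "e": ["d"]}
print(closeness_rank(path, "c"))
```

Each breadth-first search costs $O(m)$ on a connected graph, so the $n$ searches plus the final comparison pass account for the $O(n \cdot m + n)$ bound quoted above.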

    Degree Ranking Using Local Information

    Most real-world dynamic networks evolve very fast with time. It is not feasible to collect the entire network at any given time to study its characteristics. This creates the need for local algorithms to study various properties of the network. In the present work, we estimate the degree rank of a node without having the entire network. The proposed methods are based on the power-law degree distribution characteristic or on sampling techniques. The proposed methods are simulated on synthetic networks as well as on real-world social networks. The efficiency of the proposed methods is evaluated using absolute and weighted error functions. Results show that the degree rank of a node can be estimated with high accuracy using samples amounting to only 1% of the network size. The accuracy of the estimation decreases from high-ranked to low-ranked nodes. We further extend the proposed methods to random networks and validate their efficiency on synthetic random networks generated using the Erdős-Rényi model. Results show that the proposed methods can be used efficiently for random networks as well.
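
A sampling-based degree rank estimate of the kind this abstract mentions might look like the sketch below. It assumes that the network size is known and that the degree of any sampled node can be queried; the estimator shown (scale up the sampled fraction of higher-degree nodes) is an illustrative choice, not necessarily the one used in the paper, and all names are hypothetical.

```python
import random

def estimate_degree_rank(degree_of, nodes, target, sample_size, rng=None):
    """Estimate the degree rank of target (1 = highest degree) from a uniform sample.

    degree_of: callable returning the degree of a node (assumed to be available).
    nodes: list of node identifiers to sample from.
    """
    rng = rng or random.Random()
    sample = rng.sample(nodes, sample_size)  # uniform, without replacement
    target_degree = degree_of(target)
    higher = sum(1 for u in sample if degree_of(u) > target_degree)
    # Scale the count of higher-degree sampled nodes up to the whole network.
    return 1 + round(higher * len(nodes) / sample_size)

# Hypothetical toy data: node i has degree i.
degrees = {i: i for i in range(100)}
print(estimate_degree_rank(degrees.get, list(degrees), 90, 10, random.Random(0)))
```

When the sample covers every node, the estimate coincides with the exact rank; smaller samples trade accuracy for fewer degree queries.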

    Fast Shortest Path Distance Estimation in Large Networks

    We study the problem of preprocessing a large graph so that point-to-point shortest-path queries can be answered very fast. Computing shortest paths is a well-studied problem, but exact algorithms do not scale to the huge graphs encountered on the web, in social networks, and in other applications. In this paper we focus on approximate methods for distance estimation, in particular using landmark-based distance indexing. This approach involves selecting a subset of nodes as landmarks and computing (offline) the distances from each node in the graph to those landmarks. At runtime, when the distance between a pair of nodes is needed, we can estimate it quickly by combining the precomputed distances of the two nodes to the landmarks. We prove that selecting the optimal set of landmarks is an NP-hard problem, and thus heuristic solutions need to be employed. Given a budget of memory for the index, which translates directly into a budget of landmarks, different landmark selection strategies can yield dramatically different results in terms of accuracy. A number of simple methods that scale well to large graphs are therefore developed and experimentally compared. The simplest methods choose central nodes of the graph, while the more elaborate ones select central nodes that are also far away from one another. The efficiency of the suggested techniques is tested experimentally using five different real-world graphs with millions of edges; for a given accuracy, they require as much as 250 times less space than the current approach in the literature, which considers selecting landmarks at random. Finally, we study applications of our method in two problems arising naturally in large-scale networks, namely, social search and community detection. Yahoo! Research (internship
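
The landmark idea described in this abstract can be sketched as follows: precompute breadth-first distances from each landmark, then bound the distance between two nodes with the triangle inequality. This is a minimal illustration assuming a connected, unweighted, undirected graph; picking the highest-degree nodes as landmarks is just one simple notion of "central" chosen here, and the function names are hypothetical.

```python
from collections import deque

def bfs_distances(adj, source):
    """Hop distances from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def top_degree_landmarks(adj, k):
    """Assumed selection rule: the k nodes with the highest degree."""
    return sorted(adj, key=lambda u: len(adj[u]), reverse=True)[:k]

def build_index(adj, landmarks):
    """Offline step: one BFS per landmark."""
    return {l: bfs_distances(adj, l) for l in landmarks}

def estimate_distance(index, s, t):
    """Query step: (lower, upper) bounds on d(s, t) from the triangle inequality."""
    upper = min(d[s] + d[t] for d in index.values())
    lower = max(abs(d[s] - d[t]) for d in index.values())
    return lower, upper

# Hypothetical example: the path graph a-b-c-d-e with the single landmark c.
path = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c", "e"], "e": ["d"]}
index = build_index(path, ["c"])
print(estimate_distance(index, "a", "e"))
```

The upper bound is exact whenever some landmark lies on a shortest path between the two nodes, which is one intuition for preferring central landmarks over random ones.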

    Discriminative Distance-Based Network Indices with Application to Link Prediction

    In large networks, using the length of shortest paths as the distance measure has shortcomings. A well-studied shortcoming is that extending it to disconnected graphs and directed graphs is controversial. The second shortcoming is that a huge number of vertices may have exactly the same score. The third shortcoming is that in many applications, the distance between two vertices depends not only on the length of shortest paths, but also on the number of shortest paths. In this paper, first we develop a new distance measure between vertices of a graph that yields discriminative distance-based centrality indices. This measure is proportional to the length of shortest paths and inversely proportional to the number of shortest paths. We present algorithms for exact computation of the proposed discriminative indices. Second, we develop randomized algorithms that precisely estimate average discriminative path length and average discriminative eccentricity and show that they give $(\epsilon,\delta)$-approximations of these indices. Third, we perform extensive experiments over several real-world networks from different domains. In our experiments, we first show that, compared to the traditional indices, discriminative indices usually have much greater discriminability. Then, we show that our randomized algorithms can very precisely estimate average discriminative path length and average discriminative eccentricity using only a few samples. Then, we show that real-world networks usually have a tiny average discriminative path length, bounded by a constant (e.g., 2). Fourth, in order to better motivate the usefulness of our proposed distance measure, we present a novel link prediction method that uses discriminative distance to decide which vertices are more likely to form a link in the future, and show its superior performance compared to well-known existing measures.
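
One reading of "proportional to the length of shortest paths and inversely proportional to the number of shortest paths" is the ratio of the two quantities. The sketch below computes that ratio with a single breadth-first search that also counts shortest paths. It assumes an unweighted graph in which the target is reachable from the source; the exact form of the ratio and the function names are assumptions made for illustration, not a statement of the paper's definition.

```python
from collections import deque

def shortest_path_counts(adj, source):
    """BFS returning hop distances and the number of shortest paths from source."""
    dist = {source: 0}
    sigma = {source: 1}  # sigma[v] = number of shortest source-v paths
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                sigma[v] = 0
                queue.append(v)
            if dist[v] == dist[u] + 1:
                # Every shortest path to u extends to a shortest path to v.
                sigma[v] += sigma[u]
    return dist, sigma

def discriminative_distance(adj, u, v):
    """Assumed form: shortest-path length divided by the number of shortest paths."""
    dist, sigma = shortest_path_counts(adj, u)
    return dist[v] / sigma[v]

# Hypothetical examples: a 4-cycle a-b-c-d-a and a 3-node path x-y-z.
cycle = {"a": ["b", "d"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c", "a"]}
line = {"x": ["y"], "y": ["x", "z"], "z": ["y"]}
print(discriminative_distance(cycle, "a", "c"), discriminative_distance(line, "x", "z"))
```

Both pairs are two hops apart, yet the pair joined by two shortest paths receives a smaller value, which illustrates how such a measure can separate vertices that plain shortest-path length scores identically.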

    Local Ranking Problem on the BrowseGraph

    The "Local Ranking Problem" (LRP) is related to the computation of a centrality-like rank on a local graph, where the scores of the nodes could significantly differ from the ones computed on the global graph. Previous work has studied LRP on the hyperlink graph but never on the BrowseGraph, namely a graph where nodes are webpages and edges are browsing transitions. Recently, this graph has received more and more attention in many different tasks such as ranking, prediction and recommendation. However, a web server has only the browsing traffic performed on its pages (local BrowseGraph) and, as a consequence, the local computation can lead to estimation errors, which hinders the increasing number of applications in the state of the art. Also, although the divergence between the local and global ranks has been measured, the possibility of estimating such divergence using only local knowledge has been largely overlooked. These aspects are of great interest for online service providers who want to: (i) gauge their ability to correctly assess the importance of their resources based only on their local knowledge, and (ii) take into account real user browsing fluxes that better capture the actual user interest than the static hyperlink network. We study the LRP on a BrowseGraph from a large news provider, considering as subgraphs the aggregations of browsing traces of users coming from different domains. We show that the distance between rankings can be accurately predicted based only on structural information of the local graph, achieving an average rank correlation as high as 0.8.
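
A rank correlation between local and global scores can be computed along the lines of the sketch below. The abstract does not say which coefficient is used, so Kendall's tau (in its simple tau-a form) is an assumption here, as are the function name and the toy scores.

```python
from itertools import combinations

def kendall_tau(scores_a, scores_b):
    """Kendall tau-a between two score dictionaries, over their shared keys.

    Assumes at least two shared keys; tied pairs count as neither
    concordant nor discordant.
    """
    common = [k for k in scores_a if k in scores_b]
    concordant = discordant = 0
    for x, y in combinations(common, 2):
        sign = (scores_a[x] - scores_a[y]) * (scores_b[x] - scores_b[y])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    pairs = len(common) * (len(common) - 1) // 2
    return (concordant - discordant) / pairs

# Hypothetical centrality scores for three pages on a local and a global graph.
local_scores = {"p": 3.0, "q": 2.0, "r": 1.0}
global_scores = {"p": 3.0, "q": 1.0, "r": 2.0}
print(kendall_tau(local_scores, global_scores))
```

A value of 1 means the two rankings order every pair of pages the same way, and a value of -1 means they disagree on every pair.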