152 research outputs found

    ASAP : towards accurate, stable and accelerative penetrating-rank estimation on large graphs

    Get PDF
    Pervasive web applications increasingly require a measure of similarity among objects. Penetrating-Rank (P-Rank) has been one of the promising link-based similarity metrics as it provides a comprehensive way of jointly encoding both incoming and outgoing links into computation for emerging applications. In this paper, we investigate P-Rank efficiency problem that encompasses its accuracy, stability and computational time. (1) We provide an accuracy estimate for iteratively computing P-Rank. A symmetric problem is to find the iteration number K needed for achieving a given accuracy ε. (2) We also analyze the stability of P-Rank, by showing that small choices of the damping factors would make P-Rank more stable and well-conditioned. (3) For undirected graphs, we also explicitly characterize the P-Rank solution in terms of matrices. This results in a novel non-iterative algorithm, termed ASAP , for efficiently computing P-Rank, which improves the CPU time from O(n 4) to O( n 3 ). Using real and synthetic data, we empirically verify the effectiveness and efficiency of our approaches

    Taming computational complexity: efficient and parallel SimRank optimizations on undirected graphs

    Get PDF
    SimRank has been considered as one of the promising link-based ranking algorithms to evaluate similarities of web documents in many modern search engines. In this paper, we investigate the optimization problem of SimRank similarity computation on undirected web graphs. We first present a novel algorithm to estimate the SimRank between vertices in O(n3+ Kn2) time, where n is the number of vertices, and K is the number of iterations. In comparison, the most efficient implementation of SimRank algorithm in [1] takes O(K n3 ) time in the worst case. To efficiently handle large-scale computations, we also propose a parallel implementation of the SimRank algorithm on multiple processors. The experimental evaluations on both synthetic and real-life data sets demonstrate the better computational time and parallel efficiency of our proposed techniques

    Computational Approaches for Estimating Life Cycle Inventory Data

    Full text link
    Data gaps in life cycle inventory (LCI) are stumbling blocks for investigating the life cycle performance and impact of emerging technologies. It can be tedious, expensive and time consuming for LCI practitioners to collect LCI data or to wait for experime ntal data become available. I propose a computational approach to estimate missing LCI data using link prediction techniques in network science. LCI data in E coinvent 3.1 is used to test the method. The proposed approach is based on the similarities between different processes or environmental intervention s in the LCI database. By comparing two processes’ material inputs and emission outputs, I measure the similarity of these processes. I hypothesize that similar processes tend to have similar material inputs and emission outputs which are life cycle inventory data I want to estimate. In particular, I measure similarity using four metrics, including average difference, Pearson correlation coefficient, Euclidean di stance, and SimRank with or without data normalization . I test these four metrics and normalization method for their performance of estimating missing LCI data. The results show that processes in the same industrial classification have higher similarities, which validat e the approach of measuring the similarity between unit processes. I remove a small set of data (from one data point to 50) for each process and then use the rest of LCI data as to train the model for estimating the removed data. I t is found that approximately 80% of removed data can be successfully estimated with less than 10% errors. This st udy is the first attempt in the searching for an effective computational method for estimating missing LCI data. I t is anticipate d that this approach wil l significantly transform LCI compilation and LCA studies in future.Master of ScienceNatural Resources and EnvironmentUniversity of Michiganhttp://deepblue.lib.umich.edu/bitstream/2027.42/134693/3/Cai_Jiarui_Document.pd

    Link Prediction Based on Local Random Walk

    Get PDF
    The problem of missing link prediction in complex networks has attracted much attention recently. Two difficulties in link prediction are the sparsity and huge size of the target networks. Therefore, the design of an efficient and effective method is of both theoretical interests and practical significance. In this Letter, we proposed a method based on local random walk, which can give competitively good prediction or even better prediction than other random-walk-based methods while has a lower computational complexity.Comment: 6 pages, 2 figure

    LinkCluE: A MATLAB Package for Link-Based Cluster Ensembles

    Get PDF
    Cluster ensembles have emerged as a powerful meta-learning paradigm that provides improved accuracy and robustness by aggregating several input data clusterings. In particular, link-based similarity methods have recently been introduced with superior performance to the conventional co-association approach. This paper presents a MATLAB package, LinkCluE, that implements the link-based cluster ensemble framework. A variety of functional methods for evaluating clustering results, based on both internal and external criteria, are also provided. Additionally, the underlying algorithms together with the sample uses of the package with interesting real and synthetic datasets are demonstrated herein.

    Link Prediction in Complex Networks: A Survey

    Full text link
    Link prediction in complex networks has attracted increasing attention from both physical and computer science communities. The algorithms can be used to extract missing information, identify spurious interactions, evaluate network evolving mechanisms, and so on. This article summaries recent progress about link prediction algorithms, emphasizing on the contributions from physical perspectives and approaches, such as the random-walk-based methods and the maximum likelihood methods. We also introduce three typical applications: reconstruction of networks, evaluation of network evolving mechanism and classification of partially labelled networks. Finally, we introduce some applications and outline future challenges of link prediction algorithms.Comment: 44 pages, 5 figure
    corecore