    Uncovering missing links with cold ends

    To evaluate the performance of missing-link prediction, the known data are usually divided at random into two parts, the training set and the probe set. We argue that this straightforward and standard method may lead to severe bias, since in real biological and information networks, missing links are more likely to be links connecting low-degree nodes. We therefore study how to uncover missing links between low-degree nodes, that is, the case where links in the probe set have lower degree products than links in a random sample. Experimental analysis on ten local similarity indices and four disparate real networks reveals a surprising result: the Leicht-Holme-Newman index [E. A. Leicht, P. Holme, and M. E. J. Newman, Phys. Rev. E 73, 026120 (2006)] performs the best, although it was known to be one of the worst indices when the probe set is a random sample of all links. We further propose a parameter-dependent index, which considerably improves the prediction accuracy. Finally, we show the relevance of the proposed index on three real sampling methods. Comment: 16 pages, 5 figures, 6 tables
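
The contrast described in this abstract can be illustrated with a minimal sketch, assuming the standard local form of the Leicht-Holme-Newman index, s(x, y) = |Γ(x) ∩ Γ(y)| / (k_x · k_y), compared with the plain common-neighbor count. The toy graph, the function names, and the `cold_probe_set` helper (which simply takes the links with the smallest degree product) are illustrative assumptions, not the paper's sampling procedure or its proposed parameter-dependent index.

```python
def build_adjacency(edges):
    """Map each node to the set of its neighbours (undirected graph)."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def common_neighbors(adj, x, y):
    """Common-neighbor (CN) score: number of shared neighbours."""
    return len(adj[x] & adj[y])


def lhn_index(adj, x, y):
    """Local LHN score: shared neighbours divided by the degree product."""
    return len(adj[x] & adj[y]) / (len(adj[x]) * len(adj[y]))


def cold_probe_set(edges, size):
    """Hypothetical helper: the `size` links with the lowest degree product."""
    adj = build_adjacency(edges)
    return sorted(edges, key=lambda e: len(adj[e[0]]) * len(adj[e[1]]))[:size]


# Toy graph (assumed for illustration): h and g share three neighbours,
# while the low-degree nodes p and q are attached only to a.
edges = [
    ("h", "a"), ("h", "b"), ("h", "c"),
    ("g", "a"), ("g", "b"), ("g", "c"),
    ("p", "a"), ("q", "a"),
]
adj = build_adjacency(edges)

for x, y in [("h", "g"), ("p", "q")]:
    print(x, y, common_neighbors(adj, x, y), round(lhn_index(adj, x, y), 3))

print(cold_probe_set(edges, 2))
```

On this toy graph the common-neighbor count ranks the pair (h, g) above (p, q), whereas the LHN score ranks the low-degree pair (p, q) higher, which is the kind of reversal one would expect to matter when the probe set is dominated by links between low-degree nodes.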

    Link Prediction in Complex Networks: A Survey

    Link prediction in complex networks has attracted increasing attention from both the physics and computer science communities. The algorithms can be used to extract missing information, identify spurious interactions, evaluate network evolving mechanisms, and so on. This article summarizes recent progress on link prediction algorithms, emphasizing the contributions from physical perspectives and approaches, such as the random-walk-based methods and the maximum likelihood methods. We also introduce three typical applications: reconstruction of networks, evaluation of network evolving mechanisms, and classification of partially labelled networks. Finally, we outline future challenges of link prediction algorithms. Comment: 44 pages, 5 figures

    Predicting missing links via correlation between nodes

    As a fundamental problem in many different fields, link prediction aims to estimate the likelihood that a link exists between two nodes, based on the observed information. Since this problem is related to many applications, ranging from uncovering missing data to predicting the evolution of networks, link prediction has been intensively investigated recently and many methods have been proposed. The essential challenge of link prediction is to estimate the similarity between nodes. Most existing methods are based on the common neighbor index and its variants. In this paper, we propose to calculate the similarity between nodes by the Pearson correlation coefficient. This method is found to be very effective when applied to calculate similarity based on high-order paths. We finally fuse the correlation-based method with the resource allocation method, and find that the combined method can substantially outperform the existing methods, especially in sparse networks.
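
The two ingredients named in this abstract can be sketched as follows, under stated assumptions: node similarity is taken as the Pearson correlation between rows of the adjacency matrix A (or of A², as one reading of "high-order paths"), and the resource allocation score is the usual sum of 1/k_z over common neighbours z. The 4-node cycle, the function names, and the choice of A² are illustrative; the rule the paper uses to fuse the two scores is not given in the abstract, so it is not shown.

```python
from math import sqrt


def pearson(u, v):
    """Pearson correlation of two equal-length vectors (0.0 if one is constant)."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    norm_u = sqrt(sum((a - mean_u) ** 2 for a in u))
    norm_v = sqrt(sum((b - mean_v) ** 2 for b in v))
    return cov / (norm_u * norm_v) if norm_u and norm_v else 0.0


def matmul(a, b):
    """Plain-Python product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def resource_allocation(a, x, y):
    """RA score: sum of 1/degree over the common neighbours of x and y."""
    degree = [sum(row) for row in a]
    return sum(1 / degree[z] for z in range(len(a)) if a[x][z] and a[y][z])


# Toy graph (assumed): the 4-cycle 0-1-2-3-0.
A = [
    [0, 1, 0, 1],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [1, 0, 1, 0],
]
A2 = matmul(A, A)  # entry (i, j) counts paths of length 2 from i to j

print(pearson(A[0], A[2]))           # nodes 0 and 2 have identical neighbourhoods
print(pearson(A[0], A[1]))           # nodes 0 and 1 have complementary neighbourhoods
print(pearson(A2[0], A2[2]))         # same comparison on length-2 path counts
print(resource_allocation(A, 0, 2))  # two common neighbours, each of degree 2
```

Correlating rows rather than counting shared neighbours also rewards shared absences, which is one plausible reason such a score might behave differently from common-neighbor variants on sparse graphs.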

    Characterizing the spatial determinants and prevention of malaria in Kenya

    The United Nations' Sustainable Development Goal 3 is to ensure health and well-being for all at all ages, with a specific target to end malaria by 2030. Aligned with this goal, the primary objective of this study is to determine the effectiveness of utilizing local spatial variations to uncover the statistical relationships between malaria incidence rate and environmental and behavioral factors across the counties of Kenya. Two data sources are used: the Kenya Demographic and Health Surveys of 2000, 2005, 2010, and 2015, and the national Malaria Indicator Survey of 2015. The spatial analysis shows clustering of counties with high malaria incidence rate, or hot spots, in the Lake Victoria region and the east coastal area around Mombasa; there are significant clusters of counties with low incidence rate, or cold spots, in Nairobi. We apply an analysis technique, geographically weighted regression, that helps to better model how environmental and social determinants are related to malaria incidence rate while accounting for the confounding effects of spatial non-stationarity. Some general patterns persist over the four years of observation. We establish that variables including rainfall, proximity to water, vegetation, and population density show differential impacts on the incidence of malaria in Kenya. The El Niño-Southern Oscillation (ENSO) event in 2015 was significant in driving up malaria in the southern region of Lake Victoria compared with prior time periods. The applied spatial multivariate clustering analysis indicates the significance of social and behavioral survey responses. This study can help build a better spatially explicit predictive model for malaria in Kenya, capturing the role and spatial distribution of environmental, social, behavioral, and other characteristics of the households. Published version
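
The idea behind geographically weighted regression can be sketched in a few lines: instead of one global fit, a separate weighted least-squares fit is made at each location, with observations down-weighted by their distance from it. The sketch below assumes a single predictor, a Gaussian distance kernel, and synthetic data; the function name, bandwidth, and data are illustrative and do not reproduce the study's model, covariates, or kernel choice.

```python
from math import dist, exp


def gwr_local_fit(target, coords, x, y, bandwidth):
    """Weighted least-squares fit y = b0 + b1 * x centred on `target`.

    Each observation is weighted by a Gaussian kernel of its distance
    to `target` (an assumed kernel; others are possible).
    """
    w = [exp(-0.5 * (dist(target, c) / bandwidth) ** 2) for c in coords]
    total = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / total
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / total
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Synthetic data (assumed): ten sites on a line; the effect of the predictor
# on the response is 1 at the western sites and 3 at the eastern sites.
coords = [(float(i), 0.0) for i in range(10)]
x = [1, 4, 2, 5, 3, 6, 2, 7, 1, 8]
y = [xi * (1 if i < 5 else 3) for i, xi in enumerate(x)]

west = gwr_local_fit(coords[0], coords, x, y, bandwidth=1.0)
east = gwr_local_fit(coords[9], coords, x, y, bandwidth=1.0)
print("west (intercept, slope):", west)
print("east (intercept, slope):", east)
```

With a small bandwidth the local slope near the western end stays close to 1 and near the eastern end close to 3, so the fitted coefficient itself varies over space; a single global regression would average these into one slope and hide the non-stationarity.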

    Courville Castle [supplemental material]
