87 research outputs found
The effect of the initial network configuration on preferential attachment
The classical preferential attachment model is sensitive to the choice of the initial configuration of the network. As the number of initial nodes and their degree grow, so does the time needed for an equilibrium degree distribution to be established. We study this phenomenon, provide estimates of the equilibration time, and characterize the degree distribution cutoff observed at finite times. When the initial network is dense and exceeds a certain small size, there is no equilibration and a suitable statistical test can always discern the produced degree distribution from the equilibrium one. As a by-product, the weighted Kolmogorov-Smirnov statistic is demonstrated to be more suitable for statistical analysis of power-law distributions with cutoff when the data is ampl
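The growth process described in the abstract above can be illustrated with a minimal sketch. It assumes linear preferential attachment with one link per new node and a complete graph as the dense initial configuration; the function name `grow_preferential_attachment` and its parameters are hypothetical and are not taken from the paper.

```python
import random


def grow_preferential_attachment(n_initial, n_final, seed=0):
    """Return the degree list of a network grown by preferential attachment.

    Assumption: the initial configuration is a complete graph on
    n_initial nodes, and every new node brings exactly one link.
    """
    if n_initial < 2 or n_final < n_initial:
        raise ValueError("need n_initial >= 2 and n_final >= n_initial")
    rng = random.Random(seed)
    degree = [n_initial - 1] * n_initial
    # Each node appears in `stubs` once per link end, so a uniform draw
    # from `stubs` selects a node with probability proportional to degree.
    stubs = [node for node in range(n_initial) for _ in range(n_initial - 1)]
    for new_node in range(n_initial, n_final):
        target = rng.choice(stubs)
        degree[target] += 1
        degree.append(1)
        stubs.extend((target, new_node))
    return degree


if __name__ == "__main__":
    degrees = grow_preferential_attachment(n_initial=5, n_final=1000, seed=1)
    print("largest degree:", max(degrees))
```

Varying `n_initial` while keeping `n_final` fixed gives a rough way to see how a larger, denser seed network leaves a longer-lasting imprint on the degree distribution.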
Algorithmic bias amplification via temporal effects: The case of PageRank in evolving networks
Biases impair the effectiveness of algorithms. For example, the age bias of the widely-used PageRank algorithm impairs its ability to effectively rank nodes in growing networks. PageRank’s temporal bias cannot be fully explained by existing analytic results that predict a linear relation between the expected PageRank score and the indegree of a given node. We show that in evolving networks, under a mean-field approximation, the expected PageRank score of a node can be expressed as the product of the node’s indegree and a previously-neglected age factor which can “amplify” the indegree’s age bias. We use two well-known empirical networks to show that our analytic results explain the observed PageRank’s age bias and, when there is an age bias amplification, they enable estimates of the node PageRank score that are more accurate than estimates based solely on local structural information. Accuracy gains are larger in degree-degree correlated networks, as revealed by a growing directed network model with tunable assortativity. Our approach can be used to analytically study other kinds of ranking bias
Identification of milestone papers through time-balanced network centrality
Citations between scientific papers and related bibliometric indices, such as the h-index for authors and the impact factor for journals, are being increasingly used – often in controversial ways – as quantitative tools for research evaluation. Yet a fundamental research question remains open: to what extent do quantitative metrics capture the significance of scientific works? We analyze the network of citations among the 449,935 papers published by the American Physical Society (APS) journals between 1893 and 2009, and focus on the comparison of metrics built on the citation count with network-based metrics. We contrast five article-level metrics with respect to the rankings that they assign to a set of fundamental papers, called Milestone Letters, carefully selected by the APS editors for “making long-lived contributions to physics, either by announcing significant discoveries, or by initiating new areas of research”. A new metric, which combines PageRank centrality with the explicit requirement that paper score is not biased by paper age, is the best-performing metric overall in identifying the Milestone Letters. The lack of time bias in the new metric also makes it possible to compare papers of different age on the same scale. We find that network-based metrics identify the Milestone Letters better than metrics based on the citation count, which suggests that the structure of the citation network contains information that can be used to improve the ranking of scientific publications. The methods and results presented here are relevant for all evolving systems where network centrality metrics are applied, for example the World Wide Web and online social networks. An interactive Web platform where it is possible to view the ranking of the APS papers by rescaled PageRank is available at the address http://www.sciencenow.info
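The abstract above does not spell out how the age bias is removed. One plausible reading, shown here only as a sketch, is to compare each paper's score with the mean and standard deviation of scores among papers of similar age. The function `rescale_by_age`, the `window` parameter, and the use of a z-score are assumptions made for illustration.

```python
from statistics import mean, pstdev


def rescale_by_age(scores, window=4):
    """Z-score each entry against its neighbours in age order.

    Assumption: `scores` is already sorted by paper age, and `window`
    is the approximate number of similarly aged peers to compare with.
    """
    n = len(scores)
    half = window // 2
    rescaled = []
    for i in range(n):
        # Peers are the papers closest in age, clipped at the list ends.
        peers = scores[max(0, i - half):min(n, i + half + 1)]
        mu = mean(peers)
        sigma = pstdev(peers)
        rescaled.append((scores[i] - mu) / sigma if sigma > 0 else 0.0)
    return rescaled


if __name__ == "__main__":
    # Hypothetical scores, oldest paper first.
    print(rescale_by_age([8.0, 6.0, 7.0, 3.0, 2.0, 4.0, 1.0, 1.5], window=4))
```

Because every paper is measured only against its contemporaries, an old paper with a high raw score no longer automatically outranks a recent paper that stands out within its own cohort.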
Network-driven reputation in online scientific communities
The ever-increasing quantity and complexity of scientific production have made it difficult for researchers to keep track of advances in their own fields. This, together with the growing popularity of online scientific communities, calls for the development of effective information filtering tools. We propose here an algorithm which simultaneously computes reputation of users and fitness of papers in a bipartite network representing an online scientific community. Evaluation on artificially-generated data and real data from the Econophysics Forum is used to determine the method's best-performing variants. We show that when the input data is extended to a multilayer network including users, papers and authors and the algorithm is correspondingly modified, the resulting performance improves on multiple levels. In particular, top papers have higher citation count and top authors have higher h-index than top papers and top authors chosen by other algorithms. We finally show that our algorithm is robust against persistent authors (spammers), which makes the method readily applicable to existing online scientific communities
The Role of Taste Affinity in Agent-Based Models for Social Recommendation
In the Internet era, online social media emerged as the main tool for sharing opinions and information among individuals. In this work, we study an adaptive model of a social network where directed links connect users with similar tastes, and over which information propagates through social recommendation. Agent-based simulations of two different artificial settings for modeling user tastes are compared with patterns seen in real data, suggesting that users differing in their scope of interests is a more realistic assumption than users differing only in their particular interests. We further introduce an extensive set of similarity metrics based on users' past assessments, and evaluate their use in the given social recommendation model with both artificial simulations and real data. Superior recommendation performance is observed for similarity metrics that give preference to users with small scope — who thus act as selective filters in social recommendation
Ranking nodes in growing networks: When PageRank fails
PageRank is arguably the most popular ranking algorithm, and it is applied in real systems ranging from information to biological and infrastructure networks. Despite its outstanding popularity and broad use in different areas of science, the relation between the algorithm’s efficacy and properties of the network on which it acts has not yet been fully understood. We study here PageRank’s performance on a network model supported by real data, and show that realistic temporal effects make PageRank fail in identifying the most valuable nodes for a broad range of model parameters. Results on real data are in qualitative agreement with our model-based findings. This failure of PageRank reveals that the static approach to information filtering is inappropriate for a broad class of growing systems, and suggests that time-dependent algorithms that are based on the temporal linking patterns of these systems are needed to better rank the nodes
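For readers unfamiliar with the static algorithm discussed above, the following is a minimal power-iteration sketch of standard PageRank. The damping factor of 0.85, the uniform redistribution of score from nodes without outgoing links, and the toy citation network are assumptions chosen for illustration; they are not taken from the paper.

```python
def pagerank(out_links, damping=0.85, iterations=100):
    """Compute PageRank by power iteration.

    `out_links` maps every node to the list of nodes it points to;
    every target must itself appear as a key.
    """
    nodes = list(out_links)
    n = len(nodes)
    score = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Score held by nodes with no outgoing links is spread uniformly.
        dangling = sum(score[v] for v in nodes if not out_links[v])
        new_score = {
            v: (1.0 - damping) / n + damping * dangling / n for v in nodes
        }
        for v in nodes:
            for target in out_links[v]:
                new_score[target] += damping * score[v] / len(out_links[v])
        score = new_score
    return score


if __name__ == "__main__":
    # Hypothetical growing network: each node links only to older nodes.
    citations = {0: [], 1: [0], 2: [0, 1], 3: [1, 2], 4: [2, 3]}
    for node, value in sorted(pagerank(citations).items()):
        print(node, round(value, 3))
```

In a network like this, where links can only point backwards in time, score flows towards older nodes, which gives a small-scale picture of the temporal bias the abstract describes.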