47 research outputs found
Promise and Pitfalls of Extending Google's PageRank Algorithm to Citation Networks
We review our recent work on applying the Google PageRank algorithm to find
scientific "gems" among all Physical Review publications, and its extension to
CiteRank, to find currently popular research directions. These metrics provide
a meaningful extension to traditionally-used importance measures, such as the
number of citations and journal impact factor. We also point out some pitfalls
of over-relying on quantitative metrics to evaluate scientific quality.
Comment: 3 pages, 1 figure, invited comment for the Journal of Neuroscience.
The arXiv version is microscopically different from the published version.
Rich-club and page-club coefficients for directed graphs
Rich-club and page-club coefficients and their null models are introduced for
directed graphs. Null models allow for a quantitative discussion of the
rich-club and page-club phenomena. These coefficients are computed for four
directed real-world networks: the arXiv High Energy Physics paper citation network,
a Web network (released by Google), the citation network among US patents, and
an email network from an EU research institution. The results show a high
correlation between rich-club and page-club ordering. For the journal paper
citation network, we identify both rich-club and page-club ordering, showing
that "elite" papers are cited by other "elite" papers. The Google web network
shows partial rich-club and page-club ordering up to some point and then a
narrow decline of the corresponding normalized coefficients, indicating the
lack of rich-club ordering and the lack of page-club ordering, i.e., high
in-degree (PageRank) pages purposely avoid sharing links with other high
in-degree (PageRank) pages. For the US patents citation network, we identify
page-club and rich-club ordering, leading to the conclusion that "elite" patents
are cited by other "elite" patents. Finally, for the e-mail communication network
we show a lack of both rich-club and page-club ordering. We construct an example
of a synthetic network showing page-club ordering and a lack of rich-club
ordering.
Comment: 18 pages, 6 figures
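As a rough illustration of the kind of coefficient discussed in this abstract, the sketch below computes the density of directed links among nodes whose score exceeds a threshold. With in-degree as the score this gives a rich-club style coefficient; with PageRank as the score it gives a page-club style one. The normalization by n(n-1) is one common convention for directed graphs and is an assumption here, as are the names club_coefficient and in_degree; the paper's exact definition and its null models are not reproduced.

```python
def in_degree(edges):
    """Return a dict mapping each node to its number of incoming links."""
    deg = {}
    for u, v in edges:
        deg.setdefault(u, 0)          # make sure source nodes appear too
        deg[v] = deg.get(v, 0) + 1
    return deg


def club_coefficient(edges, score, threshold):
    """Density of directed links among nodes with score > threshold.

    Assumed convention: E / (n * (n - 1)), where n is the number of
    selected nodes and E the number of distinct directed links between
    them (self-loops ignored). Returns None if fewer than two nodes
    pass the threshold.
    """
    members = {v for v, s in score.items() if s > threshold}
    n = len(members)
    if n < 2:
        return None
    internal = sum(
        1 for u, v in set(edges)
        if u != v and u in members and v in members
    )
    return internal / (n * (n - 1))


# Toy directed graph (hypothetical data, for illustration only).
edges = [("a", "b"), ("b", "a"), ("c", "a"), ("c", "b"), ("d", "a")]
scores = in_degree(edges)
top_club = club_coefficient(edges, scores, threshold=1)
```

A raw value like this only becomes informative once it is divided by the same quantity measured on a suitable randomized network, which is the role the null models play in the abstract.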
Determining factors behind the PageRank log-log plot
We study the relation between PageRank and other parameters of information
networks such as in-degree, out-degree, and the fraction of dangling nodes. We
model this relation through a stochastic equation inspired by the original
definition of PageRank. Further, we use the theory of regular variation to
prove that PageRank and in-degree follow power laws with the same exponent. The
difference between these two power laws is in a multiplicative coefficient, which
depends mainly on the fraction of dangling nodes, average in-degree, the power
law exponent, and damping factor. The out-degree distribution has a minor
effect, which we explicitly quantify. Our theoretical predictions show good
agreement with experimental data on three different samples of the Web.
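For readers unfamiliar with the quantities named in this abstract (damping factor, dangling nodes), the following is a minimal sketch of the standard PageRank power iteration. It assumes uniform teleportation and spreads the rank of dangling nodes uniformly over all nodes, which is a common convention but not necessarily the one used in the paper; the paper's stochastic equation is not reproduced. The names pagerank and out_links are hypothetical.

```python
def pagerank(out_links, damping=0.85, tol=1e-12, max_iter=1000):
    """Power iteration for PageRank on a dict {node: list of targets}.

    Assumptions: uniform teleportation, and dangling nodes (no
    out-links) distribute their rank uniformly over all nodes.
    """
    nodes = sorted(
        set(out_links) | {v for targets in out_links.values() for v in targets}
    )
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}

    for _ in range(max_iter):
        # Total rank currently held by dangling nodes.
        dangling_mass = sum(rank[v] for v in nodes if not out_links.get(v))
        # Teleportation term plus the evenly spread dangling mass.
        new_rank = {
            v: (1.0 - damping) / n + damping * dangling_mass / n for v in nodes
        }
        # Each node splits its rank equally among its out-links.
        for u in nodes:
            targets = out_links.get(u, [])
            for v in targets:
                new_rank[v] += damping * rank[u] / len(targets)
        # Stop when the L1 change between iterations is negligible.
        if sum(abs(new_rank[v] - rank[v]) for v in nodes) < tol:
            return new_rank
        rank = new_rank
    return rank


# Toy graph (hypothetical): "e" is a dangling node with no in-links.
graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"], "e": []}
ranks = pagerank(graph)
```

In this toy graph the node with the most incoming links also ends up with the largest PageRank, a small-scale instance of the in-degree and PageRank relationship the abstract studies.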
Network-based ranking in social systems: three challenges
Ranking algorithms are pervasive in our increasingly digitized societies,
with important real-world applications including recommender systems, search
engines, and influencer marketing practices. From a network science
perspective, network-based ranking algorithms solve fundamental problems
related to the identification of vital nodes for the stability and dynamics of
a complex system. Despite the ubiquitous and successful applications of these
algorithms, we argue that our understanding of their performance and their
applications to real-world problems face three fundamental challenges: (1)
rankings might be biased by various factors; (2) their effectiveness might be
limited to specific problems; and (3) agents' decisions driven by rankings
might result in potentially vicious feedback mechanisms and unhealthy systemic
consequences. Methods rooted in network science and agent-based modeling can
help us to understand and overcome these challenges.
Comment: Perspective article. 9 pages, 3 figures
Tackling information asymmetry in networks: a new entropy-based ranking index
Information is a valuable asset for agents in socio-economic systems, a
significant part of the information being embedded in the very network of
connections between agents. The different interlinkage patterns that agents
establish may, in fact, lead to asymmetries in the knowledge of the network
structure; since this entails a different ability to quantify relevant
systemic properties (e.g. the risk of financial contagion in a network of
liabilities), agents capable of providing a better estimate of (otherwise)
inaccessible network properties ultimately have a competitive advantage. In
this paper, we address for the first time the issue of quantifying the
information asymmetry arising from the network topology. To this aim, we define
a novel index - InfoRank - intended to measure the quality of the information
possessed by each node, computing the Shannon entropy of the ensemble
conditioned on the node-specific information. Further, we test the performance
of our novel ranking procedure in terms of the reconstruction accuracy of the
(inaccessible) network structure and show that it outperforms other popular
centrality measures in identifying the "most informative" nodes. Finally, we
discuss the socio-economic implications of network information asymmetry.
Comment: 12 pages, 8 figures
Two types of well followed users in the followership networks of Twitter
In the Twitter blogosphere, the number of followers is probably the most
basic and succinct quantity for measuring popularity of users. However, the
number of followers can be manipulated in various ways; we can even buy
follows. Therefore, alternative popularity measures for Twitter users on the
basis of, for example, users' tweets and retweets, have been developed. In the
present work, we take a purely network approach to this fundamental question.
First, we find that two relatively distinct types of users possessing a large
number of followers exist, in particular for Japanese, Russian, and Korean
users among the seven language groups that we examined. A first type of user
follows a small number of other users. A second type of user follows
approximately the same number of other users as the number of follows that the
user receives. Then, we compare local (i.e., egocentric) followership networks
around the two types of users with many followers. We show that the second
type, which presumably consists of uninfluential users despite their large number of
followers, is characterized by high link reciprocity, a large number of friends
(i.e., those whom a user follows) for the followers, followers' high link
reciprocity, large clustering coefficient, large fraction of the second type of
users among the followers, and a small PageRank. Our network-based results
support the view that the number of followers used alone is a misleading measure of
a user's popularity. We propose that the number of friends, which is simple to
measure, also helps us to assess the popularity of Twitter users.
Comment: 4 Figures and 8 Tables
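The quantities this abstract relies on (followers, friends, link reciprocity) can be sketched for a single user as follows. Reciprocity is taken here as the fraction of the user's outgoing follow links that are returned; this is an assumed definition for illustration and may differ from the one used in the paper. The function name follow_stats and the toy data are hypothetical.

```python
def follow_stats(follows, user):
    """Summarize one user's position in a follow graph.

    `follows` maps each user to the set of users they follow (their
    "friends" in the abstract's terminology).
    """
    friends = follows.get(user, set())
    followers = {u for u, targets in follows.items() if user in targets}
    mutual = friends & followers
    return {
        "followers": len(followers),
        "friends": len(friends),
        # Assumed definition: share of outgoing links that are returned.
        "reciprocity": len(mutual) / len(friends) if friends else 0.0,
    }


# Toy follow graph (hypothetical): "star" follows few users, while
# "swap" follows back everyone who follows them.
follows = {
    "star": {"x"},
    "swap": {"a", "b", "c"},
    "a": {"star", "swap"},
    "b": {"star", "swap"},
    "c": {"star", "swap"},
    "x": set(),
}
star_stats = follow_stats(follows, "star")
swap_stats = follow_stats(follows, "swap")
```

Both toy users have the same number of followers, yet their friend counts and reciprocity differ sharply, which mirrors the two types of well-followed users described in the abstract.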