PageRank in scale-free random graphs
We analyze the distribution of PageRank on a directed configuration model and
show that as the size of the graph grows to infinity it can be closely
approximated by the PageRank of the root node of an appropriately constructed
tree. This tree approximation is in turn related to the solution of a linear
stochastic fixed point equation that has been thoroughly studied in the recent
literature.
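For orientation, the quantity whose distribution the abstract above studies can be computed on a small directed graph with a plain power iteration. This is only a minimal sketch of standard damped PageRank, not the paper's tree approximation. The function name `pagerank`, the damping factor 0.85, the iteration count, and the uniform treatment of dangling nodes are all assumptions made for illustration.

```python
def pagerank(out_links, damping=0.85, iterations=100):
    """Damped PageRank by power iteration.

    out_links[u] is the list of nodes that u points to.
    Assumption: nodes without out-links spread their mass uniformly.
    """
    n = len(out_links)
    rank = [1.0 / n] * n
    for _ in range(iterations):
        # Teleportation term shared by every node.
        new_rank = [(1.0 - damping) / n] * n
        for u, targets in enumerate(out_links):
            if targets:
                # Split u's damped rank evenly among its out-neighbors.
                share = damping * rank[u] / len(targets)
                for v in targets:
                    new_rank[v] += share
            else:
                # Dangling node: distribute its damped rank over all nodes.
                share = damping * rank[u] / n
                for v in range(n):
                    new_rank[v] += share
        rank = new_rank
    return rank


# Usage: a directed 3-cycle, where symmetry gives every node the same rank.
ranks = pagerank([[1], [2], [0]])
```

On a graph sampled from a directed configuration model, the empirical distribution of these values across nodes is the object the abstract approximates.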
A Scalable Null Model for Directed Graphs Matching All Degree Distributions: In, Out, and Reciprocal
Degree distributions are arguably the most important property of real world
networks. The classic edge configuration model or Chung-Lu model can generate
an undirected graph with any desired degree distribution. This serves as a good
null model to compare algorithms or perform experimental studies. Furthermore,
there are scalable algorithms that implement these models and they are
invaluable in the study of graphs. However, networks in the real-world are
often directed, and have a significant proportion of reciprocal edges. A
stronger relation exists between two nodes when they each point to one another
(reciprocal edge) as compared to when only one points to the other (one-way
edge). Despite their importance, reciprocal edges have been disregarded by most
directed graph models.
We propose a null model for directed graphs inspired by the Chung-Lu model
that matches the in-, out-, and reciprocal-degree distributions of the real
graphs. Our algorithm is scalable and requires $O(m)$ random numbers to
generate a graph with $m$ edges. We perform a series of experiments on real
datasets and compare with existing graph models.
Comment: Camera ready version for IEEE Workshop on Network Science; fixed some
typos in tabl
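The classic undirected Chung-Lu idea that the abstract above builds on can be sketched in a few lines. This is the simple edge-sampling variant, in which each edge draws both endpoints with probability proportional to the target degree. It is not the authors' directed, reciprocity-aware algorithm. The names `chung_lu_edges` and `realized_degrees` are hypothetical, and self-loops and repeated edges are deliberately left in.

```python
import random
from collections import Counter


def chung_lu_edges(degrees, seed=None):
    """Sample an undirected multigraph whose expected degrees match `degrees`.

    Assumption: self-loops and multi-edges are kept, as in the simplest
    edge-sampling form of the model.
    """
    rng = random.Random(seed)
    nodes = range(len(degrees))
    m = sum(degrees) // 2  # number of edges to generate
    # Two weighted draws per edge: one for each endpoint.
    endpoints = rng.choices(nodes, weights=degrees, k=2 * m)
    return list(zip(endpoints[0::2], endpoints[1::2]))


def realized_degrees(edges, n):
    """Count the degree of each node; a self-loop contributes 2."""
    counts = Counter()
    for u, v in edges:
        counts[u] += 1
        counts[v] += 1
    return [counts[i] for i in range(n)]


# Usage: target degrees for 4 nodes (sum 8, so 4 edges are sampled).
edges = chung_lu_edges([3, 2, 2, 1], seed=0)
observed = realized_degrees(edges, 4)
```

Individual realized degrees only match the targets in expectation, so a comparison against real data would average over many samples.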
Affinity Paths and Information Diffusion in Social Networks
Widespread interest in the diffusion of information through social networks
has produced a large number of Social Dynamics models. A majority of them use
theoretical hypotheses to explain their diffusion mechanisms while the few
empirically based ones average out their measures over many messages of
different content. Our empirical research tracking the step-by-step email
propagation of an invariable viral marketing message delves into the content
impact and has discovered new and striking features. The topology and dynamics
of the propagation cascades display patterns not inherited from the email
networks carrying the message. Their disconnected, low-transitivity, tree-like
cascades show a positive correlation between their nodes' probability of
forwarding the message and the average number of neighbors they target, and show
increased participant involvement as the propagation path length grows. Such
patterns, neither described before nor replicated by any of the existing models
of information diffusion, can be explained if participants make their pass-along
decisions based solely on local knowledge of their network neighbors' affinity
with the message content. We prove the plausibility of such a mechanism through a
stylized, agent-based model that replicates the Affinity Paths observed
in real information diffusion cascades.
Comment: 11 pages, 7 figures
Generating realistic scaled complex networks
Research on generative models is a central project in the emerging field of
network science, and it studies how statistical patterns found in real networks
could be generated by formal rules. Output from these generative models is then
the basis for designing and evaluating computational methods on networks, and
for verification and simulation studies. During the last two decades, a variety
of models has been proposed with an ultimate goal of achieving comprehensive
realism for the generated networks. In this study, we (a) introduce a new
generator, termed ReCoN; (b) explore how ReCoN and some existing models can be
fitted to an original network to produce a structurally similar replica, (c)
use ReCoN to produce networks much larger than the original exemplar, and
finally (d) discuss open problems and promising research directions. In a
comparative experimental study, we find that ReCoN is often superior to many
other state-of-the-art network generation methods. We argue that ReCoN is a
scalable and effective tool for modeling a given network while preserving
important properties at both micro- and macroscopic scales, and for scaling the
exemplar data by orders of magnitude in size.
Comment: 26 pages, 13 figures, extended version; a preliminary version of the
paper was presented at the 5th International Workshop on Complex Networks and
their Applications
Non mean reverting affine processes for stochastic mortality.
In this paper we use doubly stochastic processes (or Cox processes) in order to model the random evolution of mortality of an individual. These processes have been widely used in the credit risk literature in modelling default arrival, and in this context have proved to be quite flexible, especially when the intensity process is of the affine class. We investigate the applicability of affine processes in describing the individual's intensity of mortality, and provide a calibration to the Italian and UK populations. Results from the calibration seem to suggest that, in spite of their popularity in the financial context, mean reverting processes are not suitable for describing the death intensity of individuals. On the contrary, affine processes whose deterministic part increases exponentially seem to be appropriate. As for the stochastic part, negative jumps seem to do a better job than diffusive components. Stress analysis and analytical results indicate that increasing the randomness of the intensity process results in improvements in survivorship.
Keywords: doubly stochastic processes (Cox processes); stochastic mortality; affine processes
Information Ranking and Power Laws on Trees
We study the situations when the solution to a weighted stochastic recursion
has a power law tail. To this end, we develop two complementary approaches, the
first one extends Goldie's (1991) implicit renewal theorem to cover recursions
on trees; and the second one is based on a direct sample path large deviations
analysis of weighted recursive random sums. We believe that these methods may
be of independent interest in the analysis of more general weighted branching
processes as well as in the analysis of algorithms.
Synthetic sequence generator for recommender systems - memory biased random walk on sequence multilayer network
Personalized recommender systems rely on each user's personal usage data in
the system, in order to assist in decision making. However, privacy policies
protecting users' rights prevent these highly personal data from being publicly
available to a wider research audience. In this work, we propose a memory
biased random walk model on a multilayer sequence network as a generator of
synthetic sequential data for recommender systems. We demonstrate the
applicability of the synthetic data in training recommender system models for
cases when privacy policies restrict clickstream publishing.
Comment: The new updated version of the paper
A survey of statistical network models
Networks are ubiquitous in science and have become a focal point for
discussion in everyday life. Formal statistical models for the analysis of
network data have emerged as a major topic of interest in diverse areas of
study, and most of these involve a form of graphical representation.
Probability models on graphs date back to 1959. Along with empirical studies in
social psychology and sociology from the 1960s, these early works generated an
active network community and a substantial literature in the 1970s. This effort
moved into the statistical literature in the late 1970s and 1980s, and the past
decade has seen a burgeoning network literature in statistical physics and
computer science. The growth of the World Wide Web and the emergence of online
networking communities such as Facebook, MySpace, and LinkedIn, and a host of
more specialized professional network communities has intensified interest in
the study of networks and network data. Our goal in this review is to provide
the reader with an entry point to this burgeoning literature. We begin with an
overview of the historical development of statistical network modeling and then
we introduce a number of examples that have been studied in the network
literature. Our subsequent discussion focuses on a number of prominent static
and dynamic network models and their interconnections. We emphasize formal
model descriptions, and pay special attention to the interpretation of
parameters and their estimation. We end with a description of some open
problems and challenges for machine learning and statistics.
Comment: 96 pages, 14 figures, 333 references
Ranking algorithms on directed configuration networks
This paper studies the distribution of a family of rankings, which includes
Google's PageRank, on a directed configuration model. In particular, it is
shown that the distribution of the rank of a randomly chosen node in the graph
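converges in distribution to a finite random variable $\mathcal{R}^*$ that can
be written as a linear combination of i.i.d. copies of the endogenous solution
to a stochastic fixed point equation of the form
$$\mathcal{R} \stackrel{\mathcal{D}}{=} \sum_{i=1}^{\mathcal{N}} \mathcal{C}_i \mathcal{R}_i + \mathcal{Q},$$
where $(\mathcal{Q}, \mathcal{N}, \{\mathcal{C}_i\})$ is a
real-valued vector with $\mathcal{N} \in \{0, 1, 2, \dots\}$,
$P(|\mathcal{Q}| > 0) > 0$, and the $\{\mathcal{R}_i\}$ are i.i.d. copies of
$\mathcal{R}$, independent of $(\mathcal{Q}, \mathcal{N}, \{\mathcal{C}_i\})$. Moreover, we
provide precise asymptotics for the limit $\mathcal{R}^*$, which when the
in-degree distribution in the directed configuration model has a power law
imply a power law distribution for $\mathcal{R}^*$ with the same exponent.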